Preface for the Instructor


You  are  about  to  teach  a  course  that  will  probably  give  students  their  second 
exposure  to  linear  algebra.  During  their  first  brush  with  the  subject,  your 
students  probably  worked  with  Euclidean  spaces  and  matrices.  In  contrast, 
this  course  will  emphasize  abstract  vector  spaces  and  linear  maps. 

The  audacious  title  of  this  book  deserves  an  explanation.  Almost  all 
linear  algebra  books  use  determinants  to  prove  that  every  linear  operator  on 
a  finite-dimensional  complex  vector  space  has  an  eigenvalue.  Determinants 
are  difficult,  nonintuitive,  and  often  defined  without  motivation.  To  prove  the 
theorem  about  existence  of  eigenvalues  on  complex  vector  spaces,  most  books 
must  define  determinants,  prove  that  a  linear  map  is  not  invertible  if  and  only 
if  its  determinant  equals  0,  and  then  define  the  characteristic  polynomial.  This 
tortuous  (torturous?)  path  gives  students  little  feeling  for  why  eigenvalues 
exist. 

In  contrast,  the  simple  determinant-free  proofs  presented  here  (for  example, 
see  5.21)  offer  more  insight.  Once  determinants  have  been  banished  to  the 
end  of  the  book,  a  new  route  opens  to  the  main  goal  of  linear  algebra — 
understanding  the  structure  of  linear  operators. 

This  book  starts  at  the  beginning  of  the  subject,  with  no  prerequisites 
other  than  the  usual  demand  for  suitable  mathematical  maturity.  Even  if  your 
students  have  already  seen  some  of  the  material  in  the  first  few  chapters,  they 
may  be  unaccustomed  to  working  exercises  of  the  type  presented  here,  most 
of  which  require  an  understanding  of  proofs. 

Here  is  a  chapter-by-chapter  summary  of  the  highlights  of  the  book: 

• Chapter 1: Vector spaces are defined in this chapter, and their basic
properties are developed.

•  Chapter  2:  Linear  independence,  span,  basis,  and  dimension  are  defined  in 
this  chapter,  which  presents  the  basic  theory  of  finite-dimensional  vector 
spaces. 

• Chapter 3: Linear maps are introduced in this chapter. The key result here
is  the  Fundamental  Theorem  of  Linear  Maps  (3.22):  if  T  is  a  linear  map 
on  V,  then  dim  V  =  dim  null  T  +  dim  range  T.  Quotient  spaces  and  duality 
are  topics  in  this  chapter  at  a  higher  level  of  abstraction  than  other  parts 
of  the  book;  these  topics  can  be  skipped  without  running  into  problems 
elsewhere  in  the  book. 

•  Chapter  4:  The  part  of  the  theory  of  polynomials  that  will  be  needed 
to  understand  linear  operators  is  presented  in  this  chapter.  This  chapter 
contains  no  linear  algebra.  It  can  be  covered  quickly,  especially  if  your 
students  are  already  familiar  with  these  results. 

•  Chapter  5:  The  idea  of  studying  a  linear  operator  by  restricting  it  to  small 
subspaces  leads  to  eigenvectors  in  the  early  part  of  this  chapter.  The 
highlight  of  this  chapter  is  a  simple  proof  that  on  complex  vector  spaces, 
eigenvalues  always  exist.  This  result  is  then  used  to  show  that  each  linear 
operator  on  a  complex  vector  space  has  an  upper-triangular  matrix  with 
respect  to  some  basis.  All  this  is  done  without  defining  determinants  or 
characteristic  polynomials! 

•  Chapter  6:  Inner  product  spaces  are  defined  in  this  chapter,  and  their  basic 
properties  are  developed  along  with  standard  tools  such  as  orthonormal 
bases  and  the  Gram-Schmidt  Procedure.  This  chapter  also  shows  how 
orthogonal  projections  can  be  used  to  solve  certain  minimization  problems. 

•  Chapter  7:  The  Spectral  Theorem,  which  characterizes  the  linear  operators 
for  which  there  exists  an  orthonormal  basis  consisting  of  eigenvectors, 
is  the  highlight  of  this  chapter.  The  work  in  earlier  chapters  pays  off 
here  with  especially  simple  proofs.  This  chapter  also  deals  with  positive 
operators,  isometries,  the  Polar  Decomposition,  and  the  Singular  Value 
Decomposition. 

• Chapter 8: Minimal polynomials, characteristic polynomials, and generalized
eigenvectors are introduced in this chapter. The main achievement
of  this  chapter  is  the  description  of  a  linear  operator  on  a  complex  vector 
space  in  terms  of  its  generalized  eigenvectors.  This  description  enables 
one  to  prove  many  of  the  results  usually  proved  using  Jordan  Form.  For 
example,  these  tools  are  used  to  prove  that  every  invertible  linear  operator 
on  a  complex  vector  space  has  a  square  root.  The  chapter  concludes  with  a 
proof  that  every  linear  operator  on  a  complex  vector  space  can  be  put  into 
Jordan  Form. 

• Chapter 9: Linear operators on real vector spaces occupy center stage in
this  chapter.  Here  the  main  technique  is  complexification,  which  is  a  natural 
extension  of  an  operator  on  a  real  vector  space  to  an  operator  on  a  complex 
vector  space.  Complexification  allows  our  results  about  complex  vector 
spaces  to  be  transferred  easily  to  real  vector  spaces.  For  example,  this 
technique  is  used  to  show  that  every  linear  operator  on  a  real  vector  space 
has an invariant subspace of dimension 1 or 2. As another example, we
show that every linear operator on an odd-dimensional real vector space
has  an  eigenvalue. 

•  Chapter  10:  The  trace  and  determinant  (on  complex  vector  spaces)  are 
defined  in  this  chapter  as  the  sum  of  the  eigenvalues  and  the  product  of  the 
eigenvalues, both counting multiplicity. These easy-to-remember definitions
would not be possible with the traditional approach to eigenvalues,
because  the  traditional  method  uses  determinants  to  prove  that  sufficient 
eigenvalues  exist.  The  standard  theorems  about  determinants  now  become 
much  clearer.  The  Polar  Decomposition  and  the  Real  Spectral  Theorem  are 
used  to  derive  the  change  of  variables  formula  for  multivariable  integrals  in 
a  fashion  that  makes  the  appearance  of  the  determinant  there  seem  natural. 

This  book  usually  develops  linear  algebra  simultaneously  for  real  and 
complex  vector  spaces  by  letting  F  denote  either  the  real  or  the  complex 
numbers.  If  you  and  your  students  prefer  to  think  of  F  as  an  arbitrary  field, 
then see the comments at the end of Section 1.A. I prefer avoiding arbitrary
fields  at  this  level  because  they  introduce  extra  abstraction  without  leading 
to  any  new  linear  algebra.  Also,  students  are  more  comfortable  thinking 
of  polynomials  as  functions  instead  of  the  more  formal  objects  needed  for 
polynomials  with  coefficients  in  finite  fields.  Finally,  even  if  the  beginning 
part  of  the  theory  were  developed  with  arbitrary  fields,  inner  product  spaces 
would  push  consideration  back  to  just  real  and  complex  vector  spaces. 

You  probably  cannot  cover  everything  in  this  book  in  one  semester.  Going 
through the first eight chapters is a good goal for a one-semester course. If
you  must  reach  Chapter  10,  then  consider  covering  Chapters  4  and  9  in  fifteen 
minutes  each,  as  well  as  skipping  the  material  on  quotient  spaces  and  duality 
in  Chapter  3. 

A  goal  more  important  than  teaching  any  particular  theorem  is  to  develop  in 
students  the  ability  to  understand  and  manipulate  the  objects  of  linear  algebra. 
Mathematics  can  be  learned  only  by  doing.  Fortunately,  linear  algebra  has 
many  good  homework  exercises.  When  teaching  this  course,  during  each 
class  I  usually  assign  as  homework  several  of  the  exercises,  due  the  next  class. 
Going  over  the  homework  might  take  up  a  third  or  even  half  of  a  typical  class. 

Major changes from the previous edition:

•  This  edition  contains  561  exercises,  including  337  new  exercises  that  were 
not  in  the  previous  edition.  Exercises  now  appear  at  the  end  of  each  section, 
rather  than  at  the  end  of  each  chapter. 

•  Many  new  examples  have  been  added  to  illustrate  the  key  ideas  of  linear 
algebra. 

•  Beautiful  new  formatting,  including  the  use  of  color,  creates  pages  with  an 
unusually  pleasant  appearance  in  both  print  and  electronic  versions.  As  a 
visual  aid,  definitions  are  in  beige  boxes  and  theorems  are  in  blue  boxes  (in 
color  versions  of  the  book). 

•  Each  theorem  now  has  a  descriptive  name. 

•  New  topics  covered  in  the  book  include  product  spaces,  quotient  spaces, 
and  duality. 

•  Chapter  9  (Operators  on  Real  Vector  Spaces)  has  been  completely  rewritten 
to  take  advantage  of  simplifications  via  complexification.  This  approach 
allows  for  more  streamlined  presentations  in  Chapters  5  and  7  because 
those  chapters  now  focus  mostly  on  complex  vector  spaces. 

•  Hundreds  of  improvements  have  been  made  throughout  the  book.  For 
example,  the  proof  of  Jordan  Form  (Section  8.D)  has  been  simplified. 

Please  check  the  website  below  for  additional  information  about  the  book.  I 
may  occasionally  write  new  sections  on  additional  topics.  These  new  sections 
will  be  posted  on  the  website.  Your  suggestions,  comments,  and  corrections 
are  most  welcome. 

Best  wishes  for  teaching  a  successful  linear  algebra  class! 

Sheldon  Axler 
Mathematics  Department 
San  Francisco  State  University 
San  Francisco,  CA  94132,  USA 

website:  linear.axler.net 
e-mail:  linear@axler.net 
Twitter:  @AxlerLinear 

You are probably about to begin your second exposure to linear algebra. Unlike
your  first  brush  with  the  subject,  which  probably  emphasized  Euclidean  spaces 
and  matrices,  this  encounter  will  focus  on  abstract  vector  spaces  and  linear 
maps.  These  terms  will  be  defined  later,  so  don’t  worry  if  you  do  not  know 
what  they  mean.  This  book  starts  from  the  beginning  of  the  subject,  assuming 
no  knowledge  of  linear  algebra.  The  key  point  is  that  you  are  about  to 
immerse  yourself  in  serious  mathematics,  with  an  emphasis  on  attaining  a 
deep  understanding  of  the  definitions,  theorems,  and  proofs. 

You  cannot  read  mathematics  the  way  you  read  a  novel.  If  you  zip  through  a 
page  in  less  than  an  hour,  you  are  probably  going  too  fast.  When  you  encounter 
the  phrase  “as  you  should  verify”,  you  should  indeed  do  the  verification,  which 
will  usually  require  some  writing  on  your  part.  When  steps  are  left  out,  you 
need  to  supply  the  missing  pieces.  You  should  ponder  and  internalize  each 
definition.  For  each  theorem,  you  should  seek  examples  to  show  why  each 
hypothesis  is  necessary.  Discussions  with  other  students  should  help. 

As  a  visual  aid,  definitions  are  in  beige  boxes  and  theorems  are  in  blue 
boxes  (in  color  versions  of  the  book).  Each  theorem  has  a  descriptive  name. 

Please  check  the  website  below  for  additional  information  about  the  book.  I 
may  occasionally  write  new  sections  on  additional  topics.  These  new  sections 
will  be  posted  on  the  website.  Your  suggestions,  comments,  and  corrections 
are  most  welcome. 

Best  wishes  for  success  and  enjoyment  in  learning  linear  algebra! 

Sheldon  Axler 
Mathematics  Department 
San  Francisco  State  University 
San  Francisco,  CA  94132,  USA 

website:  linear.axler.net 
e-mail:  linear@axler.net 
Twitter:  @AxlerLinear 

René Descartes explaining his
work  to  Queen  Christina  of 
Sweden.  Vector  spaces  are  a 
generalization  of  the 
description  of  a  plane  using 
two  coordinates,  as  published 
by  Descartes  in  1637. 


Vector  Spaces 


Linear  algebra  is  the  study  of  linear  maps  on  finite-dimensional  vector  spaces. 
Eventually  we  will  learn  what  all  these  terms  mean.  In  this  chapter  we  will 
define  vector  spaces  and  discuss  their  elementary  properties. 

In  linear  algebra,  better  theorems  and  more  insight  emerge  if  complex 
numbers  are  investigated  along  with  real  numbers.  Thus  we  will  begin  by 
introducing  the  complex  numbers  and  their  basic  properties. 

We will generalize the examples of a plane and ordinary space to $\mathbf{R}^n$
and $\mathbf{C}^n$, which we then will generalize to the notion of a vector space. The
elementary  properties  of  a  vector  space  will  already  seem  familiar  to  you. 

Then  our  next  topic  will  be  subspaces,  which  play  a  role  for  vector  spaces 
analogous  to  the  role  played  by  subsets  for  sets.  Finally,  we  will  look  at  sums 
of  subspaces  (analogous  to  unions  of  subsets)  and  direct  sums  of  subspaces 
(analogous  to  unions  of  disjoint  sets). 

LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  basic  properties  of  the  complex  numbers 

■ $\mathbf{R}^n$ and $\mathbf{C}^n$

■  vector  spaces 

■  subspaces 

■  sums  and  direct  sums  of  subspaces 

1.A $\mathbf{R}^n$ and $\mathbf{C}^n$


Complex  Numbers 

You  should  already  be  familiar  with  basic  properties  of  the  set  R  of  real 
numbers.  Complex  numbers  were  invented  so  that  we  can  take  square  roots  of 
negative numbers. The idea is to assume we have a square root of $-1$, denoted
$i$, that obeys the usual rules of arithmetic. Here are the formal definitions:

1.1  Definition  complex  numbers 

• A complex number is an ordered pair $(a, b)$, where $a, b \in \mathbf{R}$, but
we will write this as $a + bi$.

• The set of all complex numbers is denoted by $\mathbf{C}$:

\[ \mathbf{C} = \{a + bi : a, b \in \mathbf{R}\}. \]

• Addition and multiplication on $\mathbf{C}$ are defined by

\[ (a + bi) + (c + di) = (a + c) + (b + d)i, \]

\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i; \]

here $a, b, c, d \in \mathbf{R}$.

If $a \in \mathbf{R}$, we identify $a + 0i$ with the real number $a$. Thus we can think
of $\mathbf{R}$ as a subset of $\mathbf{C}$. We also usually write $0 + bi$ as just $bi$, and we usually
write $0 + 1i$ as just $i$.

Using multiplication as defined
above, you should verify that $i^2 = -1$.
Do not memorize the formula for the
product of two complex numbers; you
can always rederive it by recalling that
$i^2 = -1$ and then using the usual rules
of arithmetic (as given by 1.3).


The symbol $i$ was first used to denote
$\sqrt{-1}$ by Swiss mathematician
Leonhard Euler in 1777.




1.2 Example Evaluate $(2 + 3i)(4 + 5i)$.

Solution

\[
\begin{aligned}
(2 + 3i)(4 + 5i) &= 2 \cdot 4 + 2 \cdot (5i) + (3i) \cdot 4 + (3i)(5i) \\
&= 8 + 10i + 12i - 15 \\
&= -7 + 22i
\end{aligned}
\]
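As a quick check on Example 1.2, the following sketch implements the multiplication rule of Definition 1.1 directly on ordered pairs and compares the result with Python's built-in complex type. The helper name `cmul` is an illustrative assumption, not notation from the text.

```python
def cmul(x, y):
    """Multiply the ordered pairs x = (a, b) and y = (c, d),
    read as a + bi and c + di, using the rule in Definition 1.1."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c)

# Example 1.2: (2 + 3i)(4 + 5i)
product = cmul((2, 3), (4, 5))
print(product)  # (-7, 22)

# Python's built-in complex numbers (written with j instead of i) agree.
builtin = complex(2, 3) * complex(4, 5)
print(builtin == complex(*product))  # True

# The pair (0, 1) plays the role of i, and its square is -1.
print(cmul((0, 1), (0, 1)))  # (-1, 0)
```

Working with pairs of integers keeps the arithmetic exact, so the comparison does not depend on floating-point rounding.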
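
1.3 Properties of complex arithmetic

commutativity

$\alpha + \beta = \beta + \alpha$ and $\alpha\beta = \beta\alpha$ for all $\alpha, \beta \in \mathbf{C}$;

associativity

$(\alpha + \beta) + \lambda = \alpha + (\beta + \lambda)$ and $(\alpha\beta)\lambda = \alpha(\beta\lambda)$ for all $\alpha, \beta, \lambda \in \mathbf{C}$;

identities

$\lambda + 0 = \lambda$ and $\lambda 1 = \lambda$ for all $\lambda \in \mathbf{C}$;

additive inverse

for every $\alpha \in \mathbf{C}$, there exists a unique $\beta \in \mathbf{C}$ such that $\alpha + \beta = 0$;

multiplicative inverse

for every $\alpha \in \mathbf{C}$ with $\alpha \neq 0$, there exists a unique $\beta \in \mathbf{C}$ such that
$\alpha\beta = 1$;

distributive property

$\lambda(\alpha + \beta) = \lambda\alpha + \lambda\beta$ for all $\lambda, \alpha, \beta \in \mathbf{C}$.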

The  properties  above  are  proved  using  the  familiar  properties  of  real 
numbers  and  the  definitions  of  complex  addition  and  multiplication.  The 
next  example  shows  how  commutativity  of  complex  multiplication  is  proved. 
Proofs  of  the  other  properties  above  are  left  as  exercises. 

1.4 Example Show that $\alpha\beta = \beta\alpha$ for all $\alpha, \beta \in \mathbf{C}$.

Solution Suppose $\alpha = a + bi$ and $\beta = c + di$, where $a, b, c, d \in \mathbf{R}$. Then
the definition of multiplication of complex numbers shows that

\[
\begin{aligned}
\alpha\beta &= (a + bi)(c + di) \\
&= (ac - bd) + (ad + bc)i
\end{aligned}
\]

and

\[
\begin{aligned}
\beta\alpha &= (c + di)(a + bi) \\
&= (ca - db) + (cb + da)i.
\end{aligned}
\]

The equations above and the commutativity of multiplication and addition of
real numbers show that $\alpha\beta = \beta\alpha$.

1.5 Definition $-\alpha$, subtraction, $1/\alpha$, division

Let $\alpha, \beta \in \mathbf{C}$.

• Let $-\alpha$ denote the additive inverse of $\alpha$. Thus $-\alpha$ is the unique
complex number such that

\[ \alpha + (-\alpha) = 0. \]

• Subtraction on $\mathbf{C}$ is defined by

\[ \beta - \alpha = \beta + (-\alpha). \]

• For $\alpha \neq 0$, let $1/\alpha$ denote the multiplicative inverse of $\alpha$. Thus $1/\alpha$
is the unique complex number such that

\[ \alpha(1/\alpha) = 1. \]

• Division on $\mathbf{C}$ is defined by

\[ \beta/\alpha = \beta(1/\alpha). \]

So  that  we  can  conveniently  make  definitions  and  prove  theorems  that 
apply  to  both  real  and  complex  numbers,  we  adopt  the  following  notation: 

1.6  Notation  F 

Throughout  this  book,  F  stands  for  either  R  or  C. 

Thus  if  we  prove  a  theorem  involving 
F,  we  will  know  that  it  holds  when  F  is 
replaced  with  R  and  when  F  is  replaced 
with  C. 

Elements of F are called scalars. The word “scalar”, a fancy word for
“number”,  is  often  used  when  we  want  to  emphasize  that  an  object  is  a  number, 
as  opposed  to  a  vector  (vectors  will  be  defined  soon). 

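For $\alpha \in \mathbf{F}$ and $m$ a positive integer, we define $\alpha^m$ to denote the product of
$\alpha$ with itself $m$ times:

\[ \alpha^m = \underbrace{\alpha \cdots \alpha}_{m \text{ times}}. \]

Clearly $(\alpha^m)^n = \alpha^{mn}$ and $(\alpha\beta)^m = \alpha^m \beta^m$ for all $\alpha, \beta \in \mathbf{F}$ and all positive
integers $m, n$.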


The letter F is used because R and
C are examples of what are called
fields.

Lists

Before defining $\mathbf{R}^n$ and $\mathbf{C}^n$, we look at two important examples.

1.7 Example $\mathbf{R}^2$ and $\mathbf{R}^3$

• The set $\mathbf{R}^2$, which you can think of as a plane, is the set of all ordered
pairs of real numbers:

\[ \mathbf{R}^2 = \{(x, y) : x, y \in \mathbf{R}\}. \]

• The set $\mathbf{R}^3$, which you can think of as ordinary space, is the set of all
ordered triples of real numbers:

\[ \mathbf{R}^3 = \{(x, y, z) : x, y, z \in \mathbf{R}\}. \]

To generalize $\mathbf{R}^2$ and $\mathbf{R}^3$ to higher dimensions, we first need to discuss the
concept of lists.


1.8  Definition  list,  length 

Suppose  n  is  a  nonnegative  integer.  A  list  of  length  n  is  an  ordered 
collection  of  n  elements  (which  might  be  numbers,  other  lists,  or  more 
abstract  entities)  separated  by  commas  and  surrounded  by  parentheses.  A 
list  of  length  n  looks  like  this: 

\[ (x_1, \dots, x_n). \]

Two  lists  are  equal  if  and  only  if  they  have  the  same  length  and  the  same 
elements  in  the  same  order. 


Thus a list of length 2 is an ordered pair, and a list of length 3 is an ordered
triple.

Many mathematicians call a list of
length $n$ an $n$-tuple.

Sometimes we will use the word list without specifying its length.
Remember, however, that by definition each list has a finite length that is a
nonnegative integer. Thus an object that looks like

\[ (x_1, x_2, \dots), \]

which might be said to have infinite length, is not a list.

A  list  of  length  0  looks  like  this:  ( ).  We  consider  such  an  object  to  be  a 
list  so  that  some  of  our  theorems  will  not  have  trivial  exceptions. 

Lists  differ  from  sets  in  two  ways:  in  lists,  order  matters  and  repetitions 
have  meaning;  in  sets,  order  and  repetitions  are  irrelevant. 

1.9 Example lists versus sets

•  The  lists  (3,  5)  and  (5,  3)  are  not  equal,  but  the  sets  {3,  5}  and  {5,  3}  are 
equal. 

• The lists (4, 4) and (4, 4, 4) are not equal (they do not have the same
length), although the sets {4, 4} and {4, 4, 4} both equal the set {4}.
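The distinction in Example 1.9 can be tried out in Python, assuming we model lists (in the sense of Definition 1.8) as tuples and sets as Python `set` objects. The variable names below are illustrative only.

```python
# Tuples compare by length, elements, and order, like lists in Definition 1.8.
lists_differ_by_order = (3, 5) != (5, 3)
lists_differ_by_length = (4, 4) != (4, 4, 4)

# Sets ignore both order and repetition.
sets_ignore_order = {3, 5} == {5, 3}
sets_ignore_repetition = {4, 4} == {4, 4, 4} == {4}

# The list of length 0 is modeled by the empty tuple.
empty_list_length = len(())

print(lists_differ_by_order, lists_differ_by_length)  # True True
print(sets_ignore_order, sets_ignore_repetition)      # True True
print(empty_list_length)                              # 0
```

Tuples are used rather than Python's own `list` type only because they are immutable, which matches the mathematical notion more closely; equality behaves the same way for both.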


$\mathbf{F}^n$

To define the higher-dimensional analogues of $\mathbf{R}^2$ and $\mathbf{R}^3$, we will simply
replace R with F (which equals R or C) and replace the 2 or 3 with an
arbitrary positive integer. Specifically, fix a positive integer $n$ for the rest of
this section.

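1.10 Definition $\mathbf{F}^n$

$\mathbf{F}^n$ is the set of all lists of length $n$ of elements of $\mathbf{F}$:

\[ \mathbf{F}^n = \{(x_1, \dots, x_n) : x_j \in \mathbf{F} \text{ for } j = 1, \dots, n\}. \]

For $(x_1, \dots, x_n) \in \mathbf{F}^n$ and $j \in \{1, \dots, n\}$, we say that $x_j$ is the $j$th
coordinate of $(x_1, \dots, x_n)$.


If $\mathbf{F} = \mathbf{R}$ and $n$ equals 2 or 3, then this definition of $\mathbf{F}^n$ agrees with our
previous notions of $\mathbf{R}^2$ and $\mathbf{R}^3$.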


1.11 Example $\mathbf{C}^4$ is the set of all lists of four complex numbers:

\[ \mathbf{C}^4 = \{(z_1, z_2, z_3, z_4) : z_1, z_2, z_3, z_4 \in \mathbf{C}\}. \]


For an amusing account of how
$\mathbf{R}^3$ would be perceived by creatures
living in $\mathbf{R}^2$, read Flatland:
A Romance of Many Dimensions,
by Edwin A. Abbott. This novel,
published in 1884, may help you
imagine a physical space of four or
more dimensions.


If $n \geq 4$, we cannot visualize $\mathbf{R}^n$
as a physical object. Similarly, $\mathbf{C}^1$ can
be thought of as a plane, but for $n \geq 2$,
the human brain cannot provide a full
image of $\mathbf{C}^n$. However, even if $n$ is
large, we can perform algebraic manipulations
in $\mathbf{F}^n$ as easily as in $\mathbf{R}^2$ or $\mathbf{R}^3$.
For example, addition in $\mathbf{F}^n$ is defined
as follows:

1.12 Definition addition in $\mathbf{F}^n$

Addition in $\mathbf{F}^n$ is defined by adding corresponding coordinates:

\[ (x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n). \]

Often the mathematics of $\mathbf{F}^n$ becomes cleaner if we use a single letter to
denote a list of $n$ numbers, without explicitly writing the coordinates. For
example, the result below is stated with $x$ and $y$ in $\mathbf{F}^n$ even though the proof
requires the more cumbersome notation of $(x_1, \dots, x_n)$ and $(y_1, \dots, y_n)$.

1.13 Commutativity of addition in $\mathbf{F}^n$

If $x, y \in \mathbf{F}^n$, then $x + y = y + x$.

Proof Suppose $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then

\[
\begin{aligned}
x + y &= (x_1, \dots, x_n) + (y_1, \dots, y_n) \\
&= (x_1 + y_1, \dots, x_n + y_n) \\
&= (y_1 + x_1, \dots, y_n + x_n) \\
&= (y_1, \dots, y_n) + (x_1, \dots, x_n) \\
&= y + x,
\end{aligned}
\]

where the second and fourth equalities above hold because of the definition of
addition in $\mathbf{F}^n$ and the third equality holds because of the usual commutativity
of addition in $\mathbf{F}$. ■
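Coordinatewise addition (1.12) and the commutativity result (1.13) can be illustrated with a short sketch. It assumes elements of F^n are modeled as Python tuples of numbers; the function name `add` is a hypothetical helper, and a numerical check on one pair of lists is of course not a substitute for the proof.

```python
def add(x, y):
    """Add two lists of the same length n coordinate by coordinate (1.12)."""
    if len(x) != len(y):
        raise ValueError("both lists must have the same length n")
    return tuple(xj + yj for xj, yj in zip(x, y))

# Two elements of C^3 (Python writes the imaginary unit as j).
x = (1, 2 + 3j, -4)
y = (5, 7 - 1j, 10)

print(add(x, y))               # (6, (9+2j), 6)
print(add(x, y) == add(y, x))  # True
```

The length check mirrors the fact that the sum of an element of F^n and an element of F^m is not defined when n and m differ.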

If a single letter is used to denote
an element of $\mathbf{F}^n$, then the same letter
with appropriate subscripts is often used
when coordinates must be displayed. For example, if $x \in \mathbf{F}^n$, then letting $x$
equal $(x_1, \dots, x_n)$ is good notation, as shown in the proof above. Even better,
work with just $x$ and avoid explicit coordinates when possible.

1.14  Definition  0 

Let  0  denote  the  list  of  length  n  whose  coordinates  are  all  0: 

\[ 0 = (0, \dots, 0). \]


The symbol ■ means “end of the
proof”.

Here we are using the symbol 0 in two different ways: on the left side of the
equation in 1.14, the symbol 0 denotes a list of length $n$, whereas on the right
side, each 0 denotes a number. This potentially confusing practice actually
causes no problems because the context always makes clear what is intended.


1.15 Example Consider the statement that 0 is an additive identity for $\mathbf{F}^n$:

\[ x + 0 = x \quad \text{for all } x \in \mathbf{F}^n. \]

Is the 0 above the number 0 or the list 0?

Solution Here 0 is a list, because we have not defined the sum of an element
of $\mathbf{F}^n$ (namely, $x$) and the number 0.


Elements of $\mathbf{R}^2$ can be
thought of as points
or as vectors.


Mathematical models of the economy
can have thousands of variables,
say $x_1, \dots, x_{5000}$, which
means that we must operate in
$\mathbf{R}^{5000}$. Such a space cannot be
dealt with geometrically. However,
the algebraic approach works well.
Thus our subject is called linear
algebra.

A picture can aid our intuition. We
will draw pictures in $\mathbf{R}^2$ because we
can sketch this space on 2-dimensional
surfaces such as paper and blackboards.
A typical element of $\mathbf{R}^2$ is a point
$x = (x_1, x_2)$. Sometimes we think of $x$ not
as a point but as an arrow starting at the
origin and ending at $(x_1, x_2)$, as shown
here. When we think of $x$ as an arrow,
we refer to it as a vector.

When we think of vectors in $\mathbf{R}^2$ as
arrows, we can move an arrow parallel
to itself (not changing its length or
direction) and still think of it as the same
vector. With that viewpoint, you will
often gain better understanding by
dispensing with the coordinate axes and
the explicit coordinates and just
thinking of the vector, as shown here.

Whenever we use pictures in $\mathbf{R}^2$
or use the somewhat vague language
of points and vectors, remember that
these are just aids to our understanding,
not substitutes for the actual mathematics
that we will develop. Although
we cannot draw good pictures in
high-dimensional spaces, the elements of
these spaces are as rigorously defined
as elements of $\mathbf{R}^2$.

For example, $(2, -3, 17, \pi, \sqrt{2})$ is an element of $\mathbf{R}^5$, and we may casually
refer to it as a point in $\mathbf{R}^5$ or a vector in $\mathbf{R}^5$ without worrying about whether
the geometry of $\mathbf{R}^5$ has any physical meaning.

Recall that we defined the sum of two elements of $\mathbf{F}^n$ to be the element of
$\mathbf{F}^n$ obtained by adding corresponding coordinates; see 1.12. As we will now
see, addition has a simple geometric interpretation in the special case of $\mathbf{R}^2$.

Suppose we have two vectors $x$ and
$y$ in $\mathbf{R}^2$ that we want to add. Move
the vector $y$ parallel to itself so that its
initial point coincides with the end point
of the vector $x$, as shown here. The
sum $x + y$ then equals the vector whose
initial point equals the initial point of
$x$ and whose end point equals the end
point of the vector $y$, as shown here.

In the next definition, the 0 on the right side of the displayed equation
below is the list $0 \in \mathbf{F}^n$.


The  sum  of  two  vectors. 


1.16 Definition additive inverse in $\mathbf{F}^n$

For $x \in \mathbf{F}^n$, the additive inverse of $x$, denoted $-x$, is the vector $-x \in \mathbf{F}^n$
such that

\[ x + (-x) = 0. \]

In other words, if $x = (x_1, \dots, x_n)$, then $-x = (-x_1, \dots, -x_n)$.
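Definitions 1.14 and 1.16 can be checked on concrete lists with the sketch below. It again assumes tuples as a model for elements of F^n, and the helper names `add`, `neg`, and `zero` are illustrative, not taken from the text.

```python
def add(x, y):
    """Coordinatewise sum of two lists of equal length (1.12)."""
    return tuple(xj + yj for xj, yj in zip(x, y))

def neg(x):
    """Additive inverse: negate every coordinate (1.16)."""
    return tuple(-xj for xj in x)

def zero(n):
    """The list of length n whose coordinates are all 0 (1.14)."""
    return (0,) * n

x = (2, -3, 17)
total = add(x, neg(x))

print(neg(x))                   # (-2, 3, -17)
print(total == zero(3))         # True
print(add(x, zero(3)) == x)     # True
```

Note that `zero(3)` is a list while each `0` inside it is a number, which is the same double use of the symbol 0 that the text points out after 1.14.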


A  vector  and  its  additive  inverse. 


For a vector $x \in \mathbf{R}^2$, the additive
inverse $-x$ is the vector parallel to $x$ and
with the same length as $x$ but pointing in
the opposite direction. The figure here
illustrates this way of thinking about the
additive inverse in $\mathbf{R}^2$.

Having dealt with addition in $\mathbf{F}^n$, we
now turn to multiplication. We could
define a multiplication in $\mathbf{F}^n$ in a similar fashion, starting with two elements
of $\mathbf{F}^n$ and getting another element of $\mathbf{F}^n$ by multiplying corresponding
coordinates. Experience shows that this definition is not useful for our purposes.
Another type of multiplication, called scalar multiplication, will be central
to our subject. Specifically, we need to define what it means to multiply an
element of $\mathbf{F}^n$ by an element of $\mathbf{F}$.




CHAPTER  1  Vector  Spaces 


1.17 Definition scalar multiplication in $\mathbf{F}^n$

The product of a number $\lambda$ and a vector in $\mathbf{F}^n$ is computed by multiplying
each coordinate of the vector by $\lambda$:

\[ \lambda(x_1, \dots, x_n) = (\lambda x_1, \dots, \lambda x_n); \]

here $\lambda \in \mathbf{F}$ and $(x_1, \dots, x_n) \in \mathbf{F}^n$.


In scalar multiplication, we multiply
together a scalar and a vector,
getting a vector. You may be familiar
with the dot product in $\mathbf{R}^2$ or
$\mathbf{R}^3$, in which we multiply together
two vectors and get a scalar.
Generalizations of the dot product will
become important when we study
inner products in Chapter 6.


Scalar multiplication.


Scalar multiplication has a nice
geometric interpretation in $\mathbf{R}^2$. If $\lambda$ is a
positive number and $x$ is a vector in
$\mathbf{R}^2$, then $\lambda x$ is the vector that points
in the same direction as $x$ and whose
length is $\lambda$ times the length of $x$. In
other words, to get $\lambda x$, we shrink or
stretch $x$ by a factor of $\lambda$, depending on
whether $\lambda < 1$ or $\lambda > 1$.

If $\lambda$ is a negative number and $x$ is a
vector in $\mathbf{R}^2$, then $\lambda x$ is the vector that
points in the direction opposite to that
of $x$ and whose length is $|\lambda|$ times the
length of $x$, as shown here.
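The geometric description above can be checked numerically. The sketch below assumes vectors are modeled as tuples and uses the Euclidean length in R^2; the names `scale` and `length` are hypothetical helpers chosen for this illustration.

```python
import math

def scale(lam, x):
    """Scalar multiplication (1.17): multiply each coordinate of x by lam."""
    return tuple(lam * xj for xj in x)

def length(x):
    """Euclidean length of a vector in R^2."""
    return math.hypot(*x)

x = (4.0, -2.0)
stretched = scale(1.5, x)    # same direction, 1.5 times as long
reversed_ = scale(-1.5, x)   # opposite direction, |-1.5| times as long

print(stretched)   # (6.0, -3.0)
print(reversed_)   # (-6.0, 3.0)
print(math.isclose(length(reversed_), 1.5 * length(x)))  # True

# The same definition works for complex scalars and vectors in C^2.
print(scale(1j, (1, 1j)) == (1j, -1))  # True
```

Lengths are compared with `math.isclose` because square roots are computed in floating point.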


Digression  on  Fields 

A  field  is  a  set  containing  at  least  two  distinct  elements  called  0  and  1,  along 
with  operations  of  addition  and  multiplication  satisfying  all  the  properties 
listed  in  1.3.  Thus  R  and  C  are  fields,  as  is  the  set  of  rational  numbers  along 
with  the  usual  operations  of  addition  and  multiplication.  Another  example  of 
a  field  is  the  set  {0, 1}  with  the  usual  operations  of  addition  and  multiplication 
except  that  1  +  1  is  defined  to  equal  0. 
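Because the set {0, 1} is so small, the properties in 1.3 can be checked for it by brute force. The sketch below assumes the operations are implemented as arithmetic modulo 2, which agrees with the usual operations except that 1 + 1 equals 0; the function names are illustrative.

```python
from itertools import product

F2 = (0, 1)

def add(a, b):
    """Addition on {0, 1}: the usual sum, except that 1 + 1 = 0."""
    return (a + b) % 2

def mul(a, b):
    """Multiplication on {0, 1}: the usual product."""
    return (a * b) % 2

def satisfies_1_3():
    """Check every property listed in 1.3 over all elements of {0, 1}."""
    for a, b, c in product(F2, repeat=3):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False  # commutativity fails
        if add(add(a, b), c) != add(a, add(b, c)):
            return False  # associativity of addition fails
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False  # associativity of multiplication fails
        if mul(c, add(a, b)) != add(mul(c, a), mul(c, b)):
            return False  # distributive property fails
    for a in F2:
        if add(a, 0) != a or mul(a, 1) != a:
            return False  # identities fail
        if sum(1 for b in F2 if add(a, b) == 0) != 1:
            return False  # no unique additive inverse
        if a != 0 and sum(1 for b in F2 if mul(a, b) == 1) != 1:
            return False  # no unique multiplicative inverse
    return True

print(add(1, 1))        # 0
print(satisfies_1_3())  # True
```

An exhaustive check like this is only possible because the set is finite; for R and C the properties must be proved.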

In  this  book  we  will  not  need  to  deal  with  fields  other  than  R  and  C. 
However,  many  of  the  definitions,  theorems,  and  proofs  in  linear  algebra  that 
work  for  both  R  and  C  also  work  without  change  for  arbitrary  fields.  If  you 
prefer  to  do  so,  throughout  Chapters  1,  2,  and  3  you  can  think  of  F  as  denoting 
an  arbitrary  field  instead  of  R  or  C,  except  that  some  of  the  examples  and 
exercises require that for each positive integer $n$ we have

\[ \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} \neq 0. \]
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EXERCISES  l.A 

1  Suppose  a  and  b  are  real  numbers,  not  both  0.  Find  real  numbers  c  and 
d  such  that 

1  /{a  +  bi )  —  c  +  di. 

2  Show  that 

-i  +  V3 ; 

2 

is  a  cube  root  of  1  (meaning  that  its  cube  equals  1). 

3  Find  two  distinct  square  roots  of  i . 

4 Show that α + β = β + α for all α, β ∈ C.

5 Show that (α + β) + λ = α + (β + λ) for all α, β, λ ∈ C.

6 Show that (αβ)λ = α(βλ) for all α, β, λ ∈ C.

7 Show that for every α ∈ C, there exists a unique β ∈ C such that

α + β = 0.

8 Show that for every α ∈ C with α ≠ 0, there exists a unique β ∈ C such
that αβ = 1.

9 Show that λ(α + β) = λα + λβ for all λ, α, β ∈ C.

10 Find x ∈ R^4 such that

(4, -3, 1, 7) + 2x = (5, 9, -6, 8).

11 Explain why there does not exist λ ∈ C such that

λ(2 - 3i, 5 + 4i, -6 + 7i) = (12 - 5i, 7 + 22i, -32 - 9i).

12 Show that (x + y) + z = x + (y + z) for all x, y, z ∈ F^n.

13 Show that (ab)x = a(bx) for all x ∈ F^n and all a, b ∈ F.

14 Show that 1x = x for all x ∈ F^n.

15 Show that λ(x + y) = λx + λy for all λ ∈ F and all x, y ∈ F^n.

16 Show that (a + b)x = ax + bx for all a, b ∈ F and all x ∈ F^n.
CHAPTER 1 Vector Spaces


Definition  of  Vector  Space 

The motivation for the definition of a vector space comes from properties of
addition and scalar multiplication in F^n: Addition is commutative, associative,
and has an identity. Every element has an additive inverse. Scalar multiplication
is associative. Scalar multiplication by 1 acts as expected. Addition and
scalar multiplication are connected by distributive properties.

We  will  define  a  vector  space  to  be  a  set  V  with  an  addition  and  a  scalar 
multiplication  on  V  that  satisfy  the  properties  in  the  paragraph  above. 

1.18 Definition addition, scalar multiplication

• An addition on a set V is a function that assigns an element u + v ∈ V
to each pair of elements u, v ∈ V.

• A scalar multiplication on a set V is a function that assigns an element
λv ∈ V to each λ ∈ F and each v ∈ V.

Now  we  are  ready  to  give  the  formal  definition  of  a  vector  space. 

1.19  Definition  vector  space 

A vector space is a set V along with an addition on V and a scalar
multiplication on V such that the following properties hold:

commutativity

u + v = v + u for all u, v ∈ V;

associativity

(u + v) + w = u + (v + w) and (ab)v = a(bv) for all u, v, w ∈ V
and all a, b ∈ F;

additive identity

there exists an element 0 ∈ V such that v + 0 = v for all v ∈ V;

additive inverse

for every v ∈ V, there exists w ∈ V such that v + w = 0;

multiplicative identity

1v = v for all v ∈ V;

distributive properties

a(u + v) = au + av and (a + b)v = av + bv for all a, b ∈ F and
all u, v ∈ V.


1.B

The following geometric language sometimes aids our intuition.

1.20  Definition  vector,  point 

Elements  of  a  vector  space  are  called  vectors  or  points. 

The scalar multiplication in a vector space depends on F. Thus when we
need to be precise, we will say that V is a vector space over F instead of
saying simply that V is a vector space. For example, R^n is a vector space over
R, and C^n is a vector space over C.

1.21 Definition real vector space, complex vector space

•  A  vector  space  over  R  is  called  a  real  vector  space. 

•  A  vector  space  over  C  is  called  a  complex  vector  space. 

Usually  the  choice  of  F  is  either  obvious  from  the  context  or  irrelevant. 
Thus  we  often  assume  that  F  is  lurking  in  the  background  without  specifically 
mentioning  it. 

With the usual operations of addition
and scalar multiplication, F^n is a vector
space over F, as you should verify. The
example of F^n motivated our definition
of vector space.

1.22 Example F^∞ is defined to be the set of all sequences of elements
of F:

F^∞ = {(x_1, x_2, ...) : x_j ∈ F for j = 1, 2, ...}.

Addition and scalar multiplication on F^∞ are defined as expected:

(x_1, x_2, ...) + (y_1, y_2, ...) = (x_1 + y_1, x_2 + y_2, ...),

λ(x_1, x_2, ...) = (λx_1, λx_2, ...).

With these definitions, F^∞ becomes a vector space over F, as you should
verify. The additive identity in this vector space is the sequence of all 0's.


The  simplest  vector  space  contains 
only  one  point.  In  other  words,  {0} 
is  a  vector  space. 


Our  next  example  of  a  vector  space  involves  a  set  of  functions. 
1.23 Notation F^S

• If S is a set, then F^S denotes the set of functions from S to F.

• For f, g ∈ F^S, the sum f + g ∈ F^S is the function defined by

(f + g)(x) = f(x) + g(x)

for all x ∈ S.

• For λ ∈ F and f ∈ F^S, the product λf ∈ F^S is the function
defined by

(λf)(x) = λf(x)

for all x ∈ S.


As an example of the notation above, if S is the interval [0, 1] and F = R,
then R^[0,1] is the set of real-valued functions on the interval [0, 1].

You  should  verify  all  three  bullet  points  in  the  next  example. 


1.24 Example F^S is a vector space

• If S is a nonempty set, then F^S (with the operations of addition and
scalar multiplication as defined above) is a vector space over F.

• The additive identity of F^S is the function 0 : S → F defined by

0(x) = 0

for all x ∈ S.

• For f ∈ F^S, the additive inverse of f is the function -f : S → F
defined by

(-f)(x) = -f(x)

for all x ∈ S.
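
The pointwise operations on functions can be made concrete with a short Python sketch. It is meant only as an illustration, under the assumption that a function from S to F is modeled by a Python callable; the names `add`, `scale`, `zero`, and `neg` are our own and do not come from the text.

```python
def add(f, g):
    """Return the function f + g, defined by (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(lam, f):
    """Return the function lam*f, defined by (lam*f)(x) = lam * f(x)."""
    return lambda x: lam * f(x)

def zero(x):
    """The additive identity: the function that sends every x to 0."""
    return 0

def neg(f):
    """Return the additive inverse -f, defined by (-f)(x) = -f(x)."""
    return lambda x: -f(x)

# Two sample real-valued functions (hypothetical choices).
def f(x):
    return x * x

def g(x):
    return 3 * x + 1
```

For instance, `add(f, g)` is again a function, and `add(f, neg(f))` takes the value 0 at every point at which it is evaluated, as the additive inverse property requires.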


Our previous examples of vector
spaces, F^n and F^∞, are special cases
of the vector space F^S because a list of
length n of numbers in F can be thought
of as a function from {1, 2, ..., n} to F
and a sequence of numbers in F can be
thought of as a function from the set of
positive integers to F. In other words, we can think of F^n as F^{1,2,...,n} and
we can think of F^∞ as F^{1,2,...}.


The elements of the vector space
R^[0,1] are real-valued functions on
[0, 1],  not  lists.  In  general,  a  vector 
space  is  an  abstract  entity  whose 
elements  might  be  lists,  functions, 
or  weird  objects. 
Soon we will see further examples of vector spaces, but first we need to
develop  some  of  the  elementary  properties  of  vector  spaces. 

The  definition  of  a  vector  space  requires  that  it  have  an  additive  identity. 
The  result  below  states  that  this  identity  is  unique. 

1.25 Unique additive identity

A  vector  space  has  a  unique  additive  identity. 


Proof Suppose 0 and 0' are both additive identities for some vector space V.
Then

0' = 0' + 0 = 0 + 0' = 0,

where the first equality holds because 0 is an additive identity, the second
equality comes from commutativity, and the third equality holds because 0'
is an additive identity. Thus 0' = 0, proving that V has only one additive
identity. ■


Each  element  v  in  a  vector  space  has  an  additive  inverse,  an  element  w  in 
the  vector  space  such  that  v  +  w  =  0.  The  next  result  shows  that  each  element 
in  a  vector  space  has  only  one  additive  inverse. 


1.26  Unique  additive  inverse 

Every  element  in  a  vector  space  has  a  unique  additive  inverse. 


Proof Suppose V is a vector space. Let v ∈ V. Suppose w and w' are additive
inverses of v. Then

w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w'.

Thus  w  =  w',  as  desired.  ■ 

Because  additive  inverses  are  unique,  the  following  notation  now  makes 
sense. 


1.27 Notation -v, w - v

Let v, w ∈ V. Then

• -v denotes the additive inverse of v;

• w - v is defined to be w + (-v).

Almost all the results in this book involve some vector space. To avoid
having  to  restate  frequently  that  V  is  a  vector  space,  we  now  make  the 
necessary  declaration  once  and  for  all: 

1.28  Notation  V 

For  the  rest  of  the  book,  V  denotes  a  vector  space  over  F. 

In the next result, 0 denotes a scalar (the number 0 ∈ F) on the left side of
the equation and a vector (the additive identity of V) on the right side of the
equation.

1.29 The number 0 times a vector

0v = 0 for every v ∈ V.


Proof For v ∈ V, we have

0v = (0 + 0)v = 0v + 0v.

Adding the additive inverse of 0v to both
sides of the equation above gives 0 =
0v, as desired. ■

In the next result, 0 denotes the additive
identity of V. Although their proofs
are similar, 1.29 and 1.30 are not identical. More precisely, 1.29 states that
the product of the scalar 0 and any vector equals the vector 0, whereas 1.30
states that the product of any scalar and the vector 0 equals the vector 0.

1.30 A number times the vector 0

a0 = 0 for every a ∈ F.

Proof For a ∈ F, we have

a0 = a(0 + 0) = a0 + a0.

Adding the additive inverse of a0 to both sides of the equation above gives
0 = a0, as desired. ■

Now we show that if an element of V is multiplied by the scalar -1, then
the result is the additive inverse of the element of V.


Note that 1.29 asserts something
about scalar multiplication and the
additive identity of V. The only
part of the definition of a vector
space that connects scalar multiplication
and vector addition is the
distributive property. Thus the distributive
property must be used in
the proof of 1.29.
1.31 The number -1 times a vector

(-1)v = -v for every v ∈ V.

Proof For v ∈ V, we have

v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = 0.

This equation says that (-1)v, when added to v, gives 0. Thus (-1)v is the
additive inverse of v, as desired. ■


EXERCISES 1.B


1 Prove that -(-v) = v for every v ∈ V.

2 Suppose a ∈ F, v ∈ V, and av = 0. Prove that a = 0 or v = 0.

3 Suppose v, w ∈ V. Explain why there exists a unique x ∈ V such that
v + 3x = w.


4  The  empty  set  is  not  a  vector  space.  The  empty  set  fails  to  satisfy  only 
one  of  the  requirements  listed  in  1.19.  Which  one? 

5 Show that in the definition of a vector space (1.19), the additive inverse
condition can be replaced with the condition that

0v = 0 for all v ∈ V.


Here  the  0  on  the  left  side  is  the  number  0,  and  the  0  on  the  right  side  is 
the  additive  identity  of  V.  (The  phrase  “a  condition  can  be  replaced”  in  a 
definition  means  that  the  collection  of  objects  satisfying  the  definition  is 
unchanged  if  the  original  condition  is  replaced  with  the  new  condition.) 


6 Let ∞ and -∞ denote two distinct objects, neither of which is in R.
Define an addition and scalar multiplication on R ∪ {∞} ∪ {-∞} as you
could guess from the notation. Specifically, the sum and product of two
real numbers is as usual, and for t ∈ R define

t∞ = -∞ if t < 0,
     0 if t = 0,
     ∞ if t > 0,

t(-∞) = ∞ if t < 0,
        0 if t = 0,
        -∞ if t > 0,

t + ∞ = ∞ + t = ∞,  t + (-∞) = (-∞) + t = -∞,

∞ + ∞ = ∞,  (-∞) + (-∞) = -∞,  ∞ + (-∞) = 0.

Is R ∪ {∞} ∪ {-∞} a vector space over R? Explain.
1.C


Subspaces 


By  considering  subspaces,  we  can  greatly  expand  our  examples  of  vector 
spaces. 


1.32  Definition  subspace 

A  subset  U  of  V  is  called  a  subspace  of  V  if  U  is  also  a  vector  space 
(using  the  same  addition  and  scalar  multiplication  as  on  V). 


1.33 Example {(x_1, x_2, 0) : x_1, x_2 ∈ F} is a subspace of F^3.


Some  mathematicians  use  the  term 
linear  subspace,  which  means  the 
same  as  subspace. 


The  next  result  gives  the  easiest  way 
to  check  whether  a  subset  of  a  vector 
space  is  a  subspace. 


1.34 Conditions for a subspace

A subset U of V is a subspace of V if and only if U satisfies the following
three conditions:

additive identity

0 ∈ U;

closed under addition

u, w ∈ U implies u + w ∈ U;

closed under scalar multiplication

a ∈ F and u ∈ U implies au ∈ U.

Proof If U is a subspace of V, then U
satisfies the three conditions above by
the definition of vector space.

Conversely, suppose U satisfies the
three conditions above. The first condition
above ensures that the additive
identity of V is in U.

The second condition above ensures
that addition makes sense on U. The
third condition ensures that scalar multiplication
makes sense on U.


The additive identity condition
above could be replaced with the
condition that U is nonempty (then
taking u ∈ U, multiplying it by 0,
and using the condition that U is
closed under scalar multiplication
would imply that 0 ∈ U). However,
if U is indeed a subspace of V,
then the easiest way to show that U
is nonempty is to show that 0 ∈ U.
If u ∈ U, then -u [which equals (-1)u by 1.31] is also in U by the third
condition above. Hence every element of U has an additive inverse in U.

The other parts of the definition of a vector space, such as associativity
and commutativity, are automatically satisfied for U because they hold on the
larger space V. Thus U is a vector space and hence is a subspace of V. ■

The  three  conditions  in  the  result  above  usually  enable  us  to  determine 
quickly  whether  a  given  subset  of  V  is  a  subspace  of  V.  You  should  verify  all 
the  assertions  in  the  next  example. 

1.35  Example  subspaces 

(a) If b ∈ F, then

{(x_1, x_2, x_3, x_4) ∈ F^4 : x_3 = 5x_4 + b}

is a subspace of F^4 if and only if b = 0.

(b) The set of continuous real-valued functions on the interval [0, 1] is a
subspace of R^[0,1].

(c) The set of differentiable real-valued functions on R is a subspace of R^R.

(d) The set of differentiable real-valued functions f on the interval (0, 3)
such that f'(2) = b is a subspace of R^(0,3) if and only if b = 0.

(e) The set of all sequences of complex numbers with limit 0 is a subspace
of C^∞.
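
To see how the three conditions of 1.34 behave in item (a), it may help to test them on a few sample vectors. The Python sketch below is a spot check on hand-picked vectors with integer entries, not a proof; the helper names `in_U`, `vec_add`, and `vec_scale` are hypothetical and chosen only for this illustration.

```python
def in_U(x, b):
    """Membership test for {(x1, x2, x3, x4) : x3 = 5*x4 + b}."""
    return x[2] == 5 * x[3] + b

def vec_add(x, y):
    """Coordinatewise sum of two vectors."""
    return tuple(p + q for p, q in zip(x, y))

def vec_scale(a, x):
    """Scalar multiple a*x."""
    return tuple(a * p for p in x)

zero = (0, 0, 0, 0)

# With b = 0, these two sample vectors satisfy x3 = 5*x4.
u = (1, 2, 10, 2)
w = (-3, 7, -5, -1)

# With b = 1, this sample vector satisfies x3 = 5*x4 + 1.
v = (0, 0, 1, 0)
```

With b = 0, the zero vector, `vec_add(u, w)`, and `vec_scale(4, u)` all pass the membership test. With b = 1, the zero vector fails it, and so does `vec_scale(2, v)` even though `v` itself passes, so the set is not closed under scalar multiplication.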


Clearly {0} is the smallest subspace
of V and V itself is the
largest  subspace  of  V.  The  empty 
set  is  not  a  subspace  of  V  because 
a  subspace  must  be  a  vector  space 
and  hence  must  contain  at  least 
one  element,  namely,  an  additive 
identity. 


Verifying some of the items above
shows the linear structure underlying
parts of calculus. For example, the second
item above requires the result that
the sum of two continuous functions is
continuous. As another example, the
fourth item above requires the result
that for a constant c, the derivative of
cf equals c times the derivative of f.

The subspaces of R^2 are precisely {0}, R^2, and all lines in R^2 through the
origin. The subspaces of R^3 are precisely {0}, R^3, all lines in R^3 through the
origin, and all planes in R^3 through the origin. To prove that all these objects
are indeed subspaces is easy; the hard part is to show that they are the only
subspaces of R^2 and R^3. That task will be easier after we introduce some
additional tools in the next chapter.
Sums of Subspaces

When  dealing  with  vector  spaces,  we 
are  usually  interested  only  in  subspaces, 
as  opposed  to  arbitrary  subsets.  The 
notion  of  the  sum  of  subspaces  will  be 
useful. 


The  union  of  subspaces  is  rarely  a 
subspace  (see  Exercise  12),  which 
is  why  we  usually  work  with  sums 
rather  than  unions. 


1.36 Definition sum of subsets

Suppose U_1, ..., U_m are subsets of V. The sum of U_1, ..., U_m, denoted
U_1 + ... + U_m, is the set of all possible sums of elements of U_1, ..., U_m.
More precisely,

U_1 + ... + U_m = {u_1 + ... + u_m : u_1 ∈ U_1, ..., u_m ∈ U_m}.

Let’s  look  at  some  examples  of  sums  of  subspaces. 

1.37 Example Suppose U is the set of all elements of F^3 whose second
and third coordinates equal 0, and W is the set of all elements of F^3 whose
first and third coordinates equal 0:

U = {(x, 0, 0) ∈ F^3 : x ∈ F} and W = {(0, y, 0) ∈ F^3 : y ∈ F}.

Then

U + W = {(x, y, 0) : x, y ∈ F},

as you should verify.


1.38 Example Suppose that U = {(x, x, y, y) ∈ F^4 : x, y ∈ F} and
W = {(x, x, x, y) ∈ F^4 : x, y ∈ F}. Then

U + W = {(x, x, y, z) ∈ F^4 : x, y, z ∈ F},

as you should verify.
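
Both inclusions in Example 1.38 can be explored numerically. The Python sketch below is an illustration rather than a proof: `decompose` exhibits one possible way (among many) to write (x, x, y, z) as a sum of an element of U and an element of W, and `sum_element` forms a typical element of U + W. All function names are our own.

```python
def vec_add(p, q):
    """Coordinatewise sum of two vectors."""
    return tuple(s + t for s, t in zip(p, q))

def decompose(x, y, z):
    """Write (x, x, y, z) as u + w with u in U and w in W.

    U consists of vectors of the form (a, a, b, b) and W of vectors of the
    form (c, c, c, d). Taking c = 0 is just one convenient choice.
    """
    u = (x, x, y, y)        # has the form (a, a, b, b)
    w = (0, 0, 0, z - y)    # has the form (c, c, c, d) with c = 0
    return u, w

def sum_element(a, b, c, d):
    """Return (a, a, b, b) + (c, c, c, d), a typical element of U + W."""
    return vec_add((a, a, b, b), (c, c, c, d))
```

For example, `decompose(1, 2, 5)` returns two vectors whose sum is (1, 1, 2, 5), and every value of `sum_element` has equal first and second coordinates, which is the only constraint on elements of U + W.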

The  next  result  states  that  the  sum  of  subspaces  is  a  subspace,  and  is  in 
fact  the  smallest  subspace  containing  all  the  summands. 

1.39 Sum of subspaces is the smallest containing subspace

Suppose U_1, ..., U_m are subspaces of V. Then U_1 + ... + U_m is the
smallest subspace of V containing U_1, ..., U_m.
Proof It is easy to see that 0 ∈ U_1 + ... + U_m and that U_1 + ... + U_m
is closed under addition and scalar multiplication. Thus 1.34 implies that
U_1 + ... + U_m is a subspace of V.

Clearly U_1, ..., U_m are all contained in U_1 + ... + U_m (to see this,
consider sums u_1 + ... + u_m where all except one of the u's are 0).
Conversely, every subspace of V containing U_1, ..., U_m contains
U_1 + ... + U_m (because subspaces must contain all finite sums of their
elements). Thus U_1 + ... + U_m is the smallest subspace of V containing
U_1, ..., U_m. ■


Sums of subspaces in the theory
of vector spaces are analogous
to unions of subsets in set theory.
Given two subspaces of a vector
space, the smallest subspace containing
them is their sum. Analogously,
given two subsets of a set,
the smallest subset containing them
is their union.


Direct  Sums 

Suppose U_1, ..., U_m are subspaces of V. Every element of U_1 + ... + U_m
can be written in the form

u_1 + ... + u_m,

where each u_j is in U_j. We will be especially interested in cases where each
vector in U_1 + ... + U_m can be represented in the form above in only one
way. This situation is so important that we give it a special name: direct sum.

1.40  Definition  direct  sum 
Suppose U_1, ..., U_m are subspaces of V.

• The sum U_1 + ... + U_m is called a direct sum if each element
of U_1 + ... + U_m can be written in only one way as a sum
u_1 + ... + u_m, where each u_j is in U_j.

• If U_1 + ... + U_m is a direct sum, then U_1 ⊕ ... ⊕ U_m denotes
U_1 + ... + U_m, with the ⊕ notation serving as an indication that
this is a direct sum.


1.41 Example Suppose U is the subspace of F^3 of those vectors whose
last coordinate equals 0, and W is the subspace of F^3 of those vectors whose
first two coordinates equal 0:

U = {(x, y, 0) ∈ F^3 : x, y ∈ F} and W = {(0, 0, z) ∈ F^3 : z ∈ F}.

Then F^3 = U ⊕ W, as you should verify.


1.42 Example Suppose U_j is the subspace of F^n of those vectors whose
coordinates are all 0, except possibly in the jth slot (thus, for example,
U_2 = {(0, x, 0, ..., 0) ∈ F^n : x ∈ F}). Then

F^n = U_1 ⊕ ... ⊕ U_n,

as you should verify.

Sometimes  nonexamples  add  to  our  understanding  as  much  as  examples. 

1.43  Example  Let 

U_1 = {(x, y, 0) ∈ F^3 : x, y ∈ F},

U_2 = {(0, 0, z) ∈ F^3 : z ∈ F},

U_3 = {(0, y, y) ∈ F^3 : y ∈ F}.

Show that U_1 + U_2 + U_3 is not a direct sum.

Solution Clearly F^3 = U_1 + U_2 + U_3, because every vector (x, y, z) ∈ F^3
can be written as

(x, y, z) = (x, y, 0) + (0, 0, z) + (0, 0, 0),

where the first vector on the right side is in U_1, the second vector is in U_2,
and the third vector is in U_3.

However, F^3 does not equal the direct sum of U_1, U_2, U_3, because the
vector (0, 0, 0) can be written in two different ways as a sum u_1 + u_2 + u_3,
with each u_j in U_j. Specifically, we have

(0, 0, 0) = (0, 1, 0) + (0, 0, 1) + (0, -1, -1)

and, of course,

(0, 0, 0) = (0, 0, 0) + (0, 0, 0) + (0, 0, 0),

where the first vector on the right side of each equation above is in U_1, the
second vector is in U_2, and the third vector is in U_3.
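
The two representations of (0, 0, 0) in Example 1.43 can be checked mechanically. The following Python sketch does so with integer coordinates; the membership tests `in_U1`, `in_U2`, `in_U3` and the helper `is_representation_of_zero` are names introduced here only for illustration.

```python
def in_U1(v):
    # U_1 consists of vectors of the form (x, y, 0).
    return v[2] == 0

def in_U2(v):
    # U_2 consists of vectors of the form (0, 0, z).
    return v[0] == 0 and v[1] == 0

def in_U3(v):
    # U_3 consists of vectors of the form (0, y, y).
    return v[0] == 0 and v[1] == v[2]

def vec_sum(*vectors):
    """Coordinatewise sum of any number of vectors."""
    return tuple(sum(coords) for coords in zip(*vectors))

def is_representation_of_zero(u1, u2, u3):
    """Check that u1 + u2 + u3 = 0 with each u_j in U_j."""
    return (in_U1(u1) and in_U2(u2) and in_U3(u3)
            and vec_sum(u1, u2, u3) == (0, 0, 0))

nontrivial = ((0, 1, 0), (0, 0, 1), (0, -1, -1))
trivial = ((0, 0, 0), (0, 0, 0), (0, 0, 0))
```

Both `is_representation_of_zero(*nontrivial)` and `is_representation_of_zero(*trivial)` hold, and the two triples differ, which is exactly the failure of uniqueness described in the solution.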

The  definition  of  direct  sum  requires 
that  every  vector  in  the  sum  have  a 
unique  representation  as  an  appropriate 
sum.  The  next  result  shows  that  when 
deciding  whether  a  sum  of  subspaces 
is  a  direct  sum,  we  need  only  consider 
whether  0  can  be  uniquely  written  as  an 
appropriate  sum. 


The symbol ⊕, which is a plus
sign inside a circle, serves as a reminder
that we are dealing with a
special type of sum of subspaces:
each element in the direct sum can
be represented only one way as a
sum of elements from the specified
subspaces.
1.44 Condition for a direct sum

Suppose U_1, ..., U_m are subspaces of V. Then U_1 + ... + U_m is a direct
sum if and only if the only way to write 0 as a sum u_1 + ... + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0.

Proof First suppose U_1 + ... + U_m is a direct sum. Then the definition of
direct sum implies that the only way to write 0 as a sum u_1 + ... + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0.

Now suppose that the only way to write 0 as a sum u_1 + ... + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0. To show that U_1 + ... + U_m
is a direct sum, let v ∈ U_1 + ... + U_m. We can write

v = u_1 + ... + u_m

for some u_1 ∈ U_1, ..., u_m ∈ U_m. To show that this representation is unique,
suppose we also have

v = v_1 + ... + v_m,

where v_1 ∈ U_1, ..., v_m ∈ U_m. Subtracting these two equations, we have

0 = (u_1 - v_1) + ... + (u_m - v_m).

Because u_1 - v_1 ∈ U_1, ..., u_m - v_m ∈ U_m, the equation above implies that
each u_j - v_j equals 0. Thus u_1 = v_1, ..., u_m = v_m, as desired. ■

The next result gives a simple condition for testing which pairs of subspaces
give a direct sum.

1.45 Direct sum of two subspaces

Suppose U and W are subspaces of V. Then U + W is a direct sum if
and only if U ∩ W = {0}.

Proof First suppose that U + W is a direct sum. If v ∈ U ∩ W, then
0 = v + (-v), where v ∈ U and -v ∈ W. By the unique representation
of 0 as the sum of a vector in U and a vector in W, we have v = 0. Thus
U ∩ W = {0}, completing the proof in one direction.

To prove the other direction, now suppose U ∩ W = {0}. To prove that
U + W is a direct sum, suppose u ∈ U, w ∈ W, and

0 = u + w.

To complete the proof, we need only show that u = w = 0 (by 1.44). The
equation above implies that u = -w ∈ W. Thus u ∈ U ∩ W. Hence u = 0,
which by the equation above implies that w = 0, completing the proof. ■
Sums of subspaces are analogous
to unions of subsets. Similarly, direct
sums of subspaces are analogous
to disjoint unions of subsets.
No two subspaces of a vector space
can be disjoint, because both contain
0. So disjointness is replaced,
at least in the case of two subspaces,
with the requirement that
the intersection equals {0}.


The result above deals only with
the case of two subspaces. When asking
about a possible direct sum with
more than two subspaces, it is not
enough to test that each pair of the
subspaces intersect only at 0. To see
this, consider Example 1.43. In that
nonexample of a direct sum, we have

U_1 ∩ U_2 = U_1 ∩ U_3 = U_2 ∩ U_3 = {0}.


EXERCISES 1.C

1 For each of the following subsets of F^3, determine whether it is a subspace
of F^3:

(a) {(x_1, x_2, x_3) ∈ F^3 : x_1 + 2x_2 + 3x_3 = 0};

(b) {(x_1, x_2, x_3) ∈ F^3 : x_1 + 2x_2 + 3x_3 = 4};

(c) {(x_1, x_2, x_3) ∈ F^3 : x_1 x_2 x_3 = 0};

(d) {(x_1, x_2, x_3) ∈ F^3 : x_1 = 5x_3}.

2 Verify all the assertions in Example 1.35.

3 Show that the set of differentiable real-valued functions f on the interval
(-4, 4) such that f'(-1) = 3f(2) is a subspace of R^(-4,4).

4 Suppose b ∈ R. Show that the set of continuous real-valued functions f
on the interval [0, 1] such that ∫_0^1 f = b is a subspace of R^[0,1] if and
only if b = 0.

5 Is R^2 a subspace of the complex vector space C^2?

6 (a) Is {(a, b, c) ∈ R^3 : a^3 = b^3} a subspace of R^3?

(b) Is {(a, b, c) ∈ C^3 : a^3 = b^3} a subspace of C^3?

7 Give an example of a nonempty subset U of R^2 such that U is closed
under addition and under taking additive inverses (meaning -u ∈ U
whenever u ∈ U), but U is not a subspace of R^2.

8 Give an example of a nonempty subset U of R^2 such that U is closed
under scalar multiplication, but U is not a subspace of R^2.
9 A function f : R → R is called periodic if there exists a positive number
p such that f(x) = f(x + p) for all x ∈ R. Is the set of periodic
functions from R to R a subspace of R^R? Explain.

10 Suppose U_1 and U_2 are subspaces of V. Prove that the intersection
U_1 ∩ U_2 is a subspace of V.

11 Prove that the intersection of every collection of subspaces of V is a
subspace of V.

12 Prove that the union of two subspaces of V is a subspace of V if and
only if one of the subspaces is contained in the other.

13 Prove that the union of three subspaces of V is a subspace of V if and
only if one of the subspaces contains the other two.

[This exercise is surprisingly harder than the previous exercise, possibly
because this exercise is not true if we replace F with a field containing
only two elements.]

14 Verify the assertion in Example 1.38.

15 Suppose U is a subspace of V. What is U + U?

16 Is the operation of addition on the subspaces of V commutative? In other
words, if U and W are subspaces of V, is U + W = W + U?

17 Is the operation of addition on the subspaces of V associative? In other
words, if U_1, U_2, U_3 are subspaces of V, is

(U_1 + U_2) + U_3 = U_1 + (U_2 + U_3)?

18 Does the operation of addition on the subspaces of V have an additive
identity? Which subspaces have additive inverses?

19 Prove or give a counterexample: if U_1, U_2, W are subspaces of V such
that

U_1 + W = U_2 + W,

then U_1 = U_2.

20 Suppose

U = {(x, x, y, y) ∈ F^4 : x, y ∈ F}.

Find a subspace W of F^4 such that F^4 = U ⊕ W.
21 Suppose

U = {(x, y, x + y, x - y, 2x) ∈ F^5 : x, y ∈ F}.

Find a subspace W of F^5 such that F^5 = U ⊕ W.

22 Suppose

U = {(x, y, x + y, x - y, 2x) ∈ F^5 : x, y ∈ F}.

Find three subspaces W_1, W_2, W_3 of F^5, none of which equals {0}, such
that F^5 = U ⊕ W_1 ⊕ W_2 ⊕ W_3.

23 Prove or give a counterexample: if U_1, U_2, W are subspaces of V such
that

V = U_1 ⊕ W and V = U_2 ⊕ W,

then U_1 = U_2.

24 A function f : R → R is called even if

f(-x) = f(x)

for all x ∈ R. A function f : R → R is called odd if

f(-x) = -f(x)

for all x ∈ R. Let U_e denote the set of real-valued even functions on R
and let U_o denote the set of real-valued odd functions on R. Show that

R^R = U_e ⊕ U_o.


CHAPTER 2

Finite-Dimensional
Vector Spaces


American mathematician Paul
Halmos (1916-2006), who in 1942
published the first modern linear
algebra book. The title of
Halmos's book was the same as the
title of this chapter.


Let’s  review  our  standing  assumptions: 

2.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  denotes  a  vector  space  over  F. 

In  the  last  chapter  we  learned  about  vector  spaces.  Linear  algebra  focuses 
not  on  arbitrary  vector  spaces,  but  on  finite-dimensional  vector  spaces,  which 
we  introduce  in  this  chapter. 

LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  span 

■  linear  independence 

■  bases 

■  dimension 
Span and Linear Independence

We have been writing lists of numbers surrounded by parentheses, and we will
continue to do so for elements of F^n; for example, (2, -7, 8) ∈ F^3. However,
now we need to consider lists of vectors (which may be elements of F^n or of
other vector spaces). To avoid confusion, we will usually write lists of vectors
without surrounding parentheses. For example, (4, 1, 6), (9, 5, 7) is a list of
length 2 of vectors in R^3.

2.2  Notation  list  of  vectors 

We  will  usually  write  lists  of  vectors  without  surrounding  parentheses. 


2.A 


Linear  Combinations  and  Span 

Adding  up  scalar  multiples  of  vectors  in  a  list  gives  what  is  called  a  linear 
combination  of  the  list.  Here  is  the  formal  definition: 

2.3  Definition  linear  combination 

A linear combination of a list v_1, ..., v_m of vectors in V is a vector of
the form

a_1 v_1 + ... + a_m v_m,

where a_1, ..., a_m ∈ F.


2.4 Example In F^3,

• (17, -4, 2) is a linear combination of (2, 1, -3), (1, -2, 4) because

(17, -4, 2) = 6(2, 1, -3) + 5(1, -2, 4).

• (17, -4, 5) is not a linear combination of (2, 1, -3), (1, -2, 4) because
there do not exist numbers a_1, a_2 ∈ F such that

(17, -4, 5) = a_1(2, 1, -3) + a_2(1, -2, 4).

In other words, the system of equations

17 = 2a_1 + a_2
-4 = a_1 - 2a_2
5 = -3a_1 + 4a_2

has no solutions (as you should verify).
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2.5  Definition  span 

The  set  of  all  linear  combinations  of  a  list  of  vectors  v\ , . . . ,  vm  in  V  is 
called  the  span  of  vi , . . . ,  vm,  denoted  span(vi , . . . ,  vm).  In  other  words, 

span(vi, . . . ,  vm)  =  {a\v\  H - V  a  m  Vm  •  d\,  •  •  •  ,  Clm  G  F}. 

The  span  of  the  empty  list  ( )  is  defined  to  be  {0}. 


2.6  Example  The  previous  example  shows  that  in  F3, 

•  (17, -4, 2)  €  span((2, 1,  —3),  (1,  —2,  4)); 

•  (17,  -4,  5)  j  span ((2,  1,-3),  (1,-2,  4)), 

Some  mathematicians  use  the  term  linear  span ,  which  means  the  same  as 
span. 
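
The verification requested in Example 2.4 (and restated in Example 2.6) can be done mechanically. The following Python sketch is an illustration added here, not part of the text: the function name `coefficients` is hypothetical, and exact rational arithmetic (`fractions.Fraction`) stands in for $\mathbf{F}$, so it only handles vectors with rational entries. It row-reduces the system of equations from Example 2.4 and reports either one choice of scalars or that none exists.

```python
from fractions import Fraction

def coefficients(vectors, target):
    """Return scalars [a_1, ..., a_m] with a_1 v_1 + ... + a_m v_m == target,
    or None when target is not a linear combination of the list."""
    m, n = len(vectors), len(target)
    # Row i of the augmented matrix encodes the i-th coordinate equation:
    # a_1 v_1[i] + ... + a_m v_m[i] = target[i].
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(target[i])]
            for i in range(n)]
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column; this scalar stays free
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    # A row reading 0 = nonzero means the system has no solutions.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    solution = [Fraction(0)] * m  # free scalars are set to 0
    for i, c in enumerate(pivot_cols):
        solution[c] = rows[i][-1]
    return solution

print(coefficients([(2, 1, -3), (1, -2, 4)], (17, -4, 2)))
print(coefficients([(2, 1, -3), (1, -2, 4)], (17, -4, 5)))
```

For the first target the sketch recovers the scalars 6 and 5 from Example 2.4; for the second it returns `None`, matching the claim that the system has no solutions.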

2.7  Span  is  the  smallest  containing  subspace 

The  span  of  a  list  of  vectors  in  V  is  the  smallest  subspace  of  V  containing 
all  the  vectors  in  the  list. 

Proof Suppose $v_1, \dots, v_m$ is a list of vectors in $V$.

First we show that $\operatorname{span}(v_1, \dots, v_m)$ is a subspace of $V$.
The additive identity is in $\operatorname{span}(v_1, \dots, v_m)$, because

\[ 0 = 0v_1 + \cdots + 0v_m. \]

Also, $\operatorname{span}(v_1, \dots, v_m)$ is closed under addition, because

\[ (a_1 v_1 + \cdots + a_m v_m) + (c_1 v_1 + \cdots + c_m v_m) = (a_1 + c_1) v_1 + \cdots + (a_m + c_m) v_m. \]

Furthermore, $\operatorname{span}(v_1, \dots, v_m)$ is closed under scalar
multiplication, because

\[ \lambda (a_1 v_1 + \cdots + a_m v_m) = \lambda a_1 v_1 + \cdots + \lambda a_m v_m. \]

Thus $\operatorname{span}(v_1, \dots, v_m)$ is a subspace of $V$ (by 1.34).

Each $v_j$ is a linear combination of $v_1, \dots, v_m$ (to show this, set
$a_j = 1$ and let the other $a$'s in 2.3 equal 0). Thus
$\operatorname{span}(v_1, \dots, v_m)$ contains each $v_j$. Conversely, because
subspaces are closed under scalar multiplication and addition, every subspace
of $V$ containing each $v_j$ contains $\operatorname{span}(v_1, \dots, v_m)$.
Thus $\operatorname{span}(v_1, \dots, v_m)$ is the smallest subspace of $V$
containing all the vectors $v_1, \dots, v_m$. ■

2.8 Definition spans

If $\operatorname{span}(v_1, \dots, v_m)$ equals $V$, we say that
$v_1, \dots, v_m$ spans $V$.


2.9 Example Suppose $n$ is a positive integer. Show that

\[ (1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, \dots, 0, 1) \]

spans $\mathbf{F}^n$. Here the $j$th vector in the list above is the $n$-tuple
with 1 in the $j$th slot and 0 in all other slots.

Solution Suppose $(x_1, \dots, x_n) \in \mathbf{F}^n$. Then

\[ (x_1, \dots, x_n) = x_1 (1, 0, \dots, 0) + x_2 (0, 1, 0, \dots, 0) + \cdots + x_n (0, \dots, 0, 1). \]

Thus $(x_1, \dots, x_n) \in \operatorname{span}((1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, \dots, 0, 1))$,
as desired.


Now  we  can  make  one  of  the  key  definitions  in  linear  algebra. 

2.10  Definition  finite-dimensional  vector  space 

A  vector  space  is  called  finite-dimensional  if  some  list  of  vectors  in  it 
spans  the  space. 


Recall that by definition every list has finite length.

Example 2.9 above shows that $\mathbf{F}^n$ is a finite-dimensional vector
space for every positive integer $n$.

The definition of a polynomial is no doubt already familiar to you.


2.11 Definition polynomial, $\mathcal{P}(\mathbf{F})$

• A function $p : \mathbf{F} \to \mathbf{F}$ is called a polynomial with
coefficients in $\mathbf{F}$ if there exist $a_0, \dots, a_m \in \mathbf{F}$
such that

\[ p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_m z^m \]

for all $z \in \mathbf{F}$.

• $\mathcal{P}(\mathbf{F})$ is the set of all polynomials with coefficients in
$\mathbf{F}$.

With the usual operations of addition and scalar multiplication,
$\mathcal{P}(\mathbf{F})$ is a vector space over $\mathbf{F}$, as you should
verify. In other words, $\mathcal{P}(\mathbf{F})$ is a subspace of
$\mathbf{F}^{\mathbf{F}}$, the vector space of functions from $\mathbf{F}$ to
$\mathbf{F}$.

If a polynomial (thought of as a function from $\mathbf{F}$ to $\mathbf{F}$) is
represented by two sets of coefficients, then subtracting one representation
of the polynomial from the other produces a polynomial that is identically
zero as a function on $\mathbf{F}$ and hence has all zero coefficients (if you
are unfamiliar with this fact, just believe it for now; we will prove it
later, see 4.7). Conclusion: the coefficients of a polynomial are uniquely
determined by the polynomial. Thus the next definition uniquely defines the
degree of a polynomial.

2.12 Definition degree of a polynomial, $\deg p$

• A polynomial $p \in \mathcal{P}(\mathbf{F})$ is said to have degree $m$ if
there exist scalars $a_0, a_1, \dots, a_m \in \mathbf{F}$ with $a_m \ne 0$ such
that

\[ p(z) = a_0 + a_1 z + \cdots + a_m z^m \]

for all $z \in \mathbf{F}$. If $p$ has degree $m$, we write $\deg p = m$.

• The polynomial that is identically 0 is said to have degree $-\infty$.

In the next definition, we use the convention that $-\infty < m$, which means
that the polynomial 0 is in $\mathcal{P}_m(\mathbf{F})$.

2.13 Definition $\mathcal{P}_m(\mathbf{F})$

For $m$ a nonnegative integer, $\mathcal{P}_m(\mathbf{F})$ denotes the set of
all polynomials with coefficients in $\mathbf{F}$ and degree at most $m$.

To verify the next example, note that
$\mathcal{P}_m(\mathbf{F}) = \operatorname{span}(1, z, \dots, z^m)$; here we are
slightly abusing notation by letting $z^k$ denote a function.

2.14 Example $\mathcal{P}_m(\mathbf{F})$ is a finite-dimensional vector space
for each nonnegative integer $m$.


2.15 Definition infinite-dimensional vector space

A vector space is called infinite-dimensional if it is not finite-dimensional.

2.16 Example Show that $\mathcal{P}(\mathbf{F})$ is infinite-dimensional.

Solution Consider any list of elements of $\mathcal{P}(\mathbf{F})$. Let $m$
denote the highest degree of the polynomials in this list. Then every
polynomial in the span of this list has degree at most $m$. Thus $z^{m+1}$ is
not in the span of our list. Hence no list spans $\mathcal{P}(\mathbf{F})$.
Thus $\mathcal{P}(\mathbf{F})$ is infinite-dimensional.
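
The degree argument in the solution above can be tried on a concrete list. The sketch below is an added illustration under stated assumptions: a polynomial $a_0 + a_1 z + \cdots + a_m z^m$ is stored as the coefficient list `[a_0, ..., a_m]`, the helper names `degree` and `combine` are hypothetical, and `float("-inf")` plays the role of the degree $-\infty$ from Definition 2.12. It checks, for one particular linear combination, that the degree does not exceed the highest degree in the list.

```python
def degree(coeffs):
    """Degree of a_0 + a_1 z + ... + a_m z^m given as [a_0, ..., a_m]."""
    for k in range(len(coeffs) - 1, -1, -1):
        if coeffs[k] != 0:
            return k
    return float("-inf")  # the polynomial that is identically 0

def combine(scalars, polys):
    """Coefficient list of the linear combination sum_i scalars[i] * polys[i]."""
    length = max((len(p) for p in polys), default=0)
    result = [0] * length
    for s, p in zip(scalars, polys):
        for k, a in enumerate(p):
            result[k] += s * a
    return result

polys = [[1, 2], [0, 0, 3], [5]]      # the list 1 + 2z, 3z^2, 5
m = max(degree(p) for p in polys)     # highest degree in the list
q = combine([4, -1, 2], polys)        # one particular linear combination
z_to_m_plus_1 = [0] * (m + 1) + [1]   # the polynomial z^(m+1)

print(m, q, degree(q), degree(z_to_m_plus_1))
```

Here `q` has degree at most `m`, while `z_to_m_plus_1` has degree `m + 1`. A single example proves nothing by itself; it only makes the bookkeeping in the solution concrete.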


Linear  Independence 

Suppose $v_1, \dots, v_m \in V$ and $v \in \operatorname{span}(v_1, \dots, v_m)$.
By the definition of span, there exist $a_1, \dots, a_m \in \mathbf{F}$ such
that

\[ v = a_1 v_1 + \cdots + a_m v_m. \]

Consider the question of whether the choice of scalars in the equation above
is unique. Suppose $c_1, \dots, c_m$ is another set of scalars such that

\[ v = c_1 v_1 + \cdots + c_m v_m. \]

Subtracting the last two equations, we have

\[ 0 = (a_1 - c_1) v_1 + \cdots + (a_m - c_m) v_m. \]

Thus we have written 0 as a linear combination of $v_1, \dots, v_m$. If the only
way to do this is the obvious way (using 0 for all scalars), then each
$a_j - c_j$ equals 0, which means that each $a_j$ equals $c_j$ (and thus the
choice of scalars was indeed unique). This situation is so important that we
give it a special name, linear independence, which we now define.

2.17 Definition linearly independent

• A list $v_1, \dots, v_m$ of vectors in $V$ is called linearly independent if
the only choice of $a_1, \dots, a_m \in \mathbf{F}$ that makes
$a_1 v_1 + \cdots + a_m v_m$ equal 0 is $a_1 = \cdots = a_m = 0$.

• The empty list ( ) is also declared to be linearly independent.

The reasoning above shows that $v_1, \dots, v_m$ is linearly independent if and
only if each vector in $\operatorname{span}(v_1, \dots, v_m)$ has only one
representation as a linear combination of $v_1, \dots, v_m$.

2.18 Example linearly independent lists

(a) A list $v$ of one vector $v \in V$ is linearly independent if and only if
$v \ne 0$.

(b) A list of two vectors in $V$ is linearly independent if and only if neither
vector is a scalar multiple of the other.

(c) $(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)$ is linearly independent in
$\mathbf{F}^4$.

(d) The list $1, z, \dots, z^m$ is linearly independent in
$\mathcal{P}(\mathbf{F})$ for each nonnegative integer $m$.


If  some  vectors  are  removed  from  a  linearly  independent  list,  the  remaining 
list  is  also  linearly  independent,  as  you  should  verify. 

2.19  Definition  linearly  dependent 

•  A  list  of  vectors  in  V  is  called  linearly  dependent  if  it  is  not  linearly 
independent. 

• In other words, a list $v_1, \dots, v_m$ of vectors in $V$ is linearly
dependent if there exist $a_1, \dots, a_m \in \mathbf{F}$, not all 0, such that

\[ a_1 v_1 + \cdots + a_m v_m = 0. \]


2.20  Example  linearly  dependent  lists 

• $(2, 3, 1), (1, -1, 2), (7, 3, 8)$ is linearly dependent in $\mathbf{F}^3$
because

\[ 2(2, 3, 1) + 3(1, -1, 2) + (-1)(7, 3, 8) = (0, 0, 0). \]

• The list $(2, 3, 1), (1, -1, 2), (7, 3, c)$ is linearly dependent in
$\mathbf{F}^3$ if and only if $c = 8$, as you should verify.

• If some vector in a list of vectors in $V$ is a linear combination of the
other vectors, then the list is linearly dependent. (Proof: After writing
one vector in the list as equal to a linear combination of the other
vectors, move that vector to the other side of the equation, where it will
be multiplied by $-1$.)

•  Every  list  of  vectors  in  V  containing  the  0  vector  is  linearly  dependent. 
(This  is  a  special  case  of  the  previous  bullet  point.) 
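
The first two bullet points of Example 2.20 can be checked by machine. The sketch below is an added illustration, not the text's method: it relies on the standard fact that a list of vectors in $\mathbf{F}^n$ is linearly independent exactly when Gaussian elimination on the list finds as many pivots as there are vectors. The function names `count_pivots` and `is_linearly_independent` are hypothetical, and `fractions.Fraction` restricts the sketch to rational entries.

```python
from fractions import Fraction

def count_pivots(vectors):
    """Number of pivots found by exact Gaussian elimination on the list."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    width = len(rows[0]) if rows else 0
    r = 0  # index of the next pivot row
    for c in range(width):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """True when only the all-zero choice of scalars gives the zero vector."""
    return count_pivots(vectors) == len(vectors)

print(is_linearly_independent([(2, 3, 1), (1, -1, 2), (7, 3, 8)]))
print(is_linearly_independent([(2, 3, 1), (1, -1, 2), (7, 3, 7)]))
```

The first list is the dependent one from the example; the second replaces 8 with 7 and is reported as independent, consistent with the condition $c = 8$. The empty list is reported as independent, in line with Definition 2.17.
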
The lemma below will often be useful. It states that given a linearly
dependent  list  of  vectors,  one  of  the  vectors  is  in  the  span  of  the  previous  ones 
and  furthermore  we  can  throw  out  that  vector  without  changing  the  span  of 
the  original  list. 


2.21  Linear  Dependence  Lemma 

Suppose $v_1, \dots, v_m$ is a linearly dependent list in $V$. Then there exists
$j \in \{1, 2, \dots, m\}$ such that the following hold:

(a) $v_j \in \operatorname{span}(v_1, \dots, v_{j-1})$;

(b) if the $j$th term is removed from $v_1, \dots, v_m$, the span of the
remaining list equals $\operatorname{span}(v_1, \dots, v_m)$.


Proof Because the list $v_1, \dots, v_m$ is linearly dependent, there exist
numbers $a_1, \dots, a_m \in \mathbf{F}$, not all 0, such that

\[ a_1 v_1 + \cdots + a_m v_m = 0. \]

Let $j$ be the largest element of $\{1, \dots, m\}$ such that $a_j \ne 0$. Then

\[ v_j = -\frac{a_1}{a_j} v_1 - \cdots - \frac{a_{j-1}}{a_j} v_{j-1}, \tag{2.22} \]

proving (a).

To prove (b), suppose $u \in \operatorname{span}(v_1, \dots, v_m)$. Then there
exist numbers $c_1, \dots, c_m \in \mathbf{F}$ such that

\[ u = c_1 v_1 + \cdots + c_m v_m. \]

In the equation above, we can replace $v_j$ with the right side of 2.22, which
shows that $u$ is in the span of the list obtained by removing the $j$th term
from $v_1, \dots, v_m$. Thus (b) holds. ■

Choosing $j = 1$ in the Linear Dependence Lemma above means that $v_1 = 0$,
because if $j = 1$ then condition (a) above is interpreted to mean that
$v_1 \in \operatorname{span}()$; recall that $\operatorname{span}() = \{0\}$.
Note also that the proof of part (b) above needs to be modified in an obvious
way if $v_1 = 0$ and $j = 1$.
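
An index $j$ as in the Linear Dependence Lemma can be found by direct search. The sketch below is an added illustration with hypothetical names (`count_pivots`, `in_span`, `dependence_index`) and rational entries only. Note one difference from the proof: the proof takes the largest index with a nonzero scalar in a given dependence relation, whereas this sketch returns the smallest $j$ with $v_j \in \operatorname{span}(v_1, \dots, v_{j-1})$. Any such $j$ satisfies (a), and hence (b).

```python
from fractions import Fraction

def count_pivots(vectors):
    """Number of pivots found by exact Gaussian elimination on the list."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    width = len(rows[0]) if rows else 0
    r = 0
    for c in range(width):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    """True when target lies in the span of the list (span of () is {0})."""
    vectors = list(vectors)
    return count_pivots(vectors + [target]) == count_pivots(vectors)

def dependence_index(vectors):
    """Smallest 1-based j with v_j in span(v_1, ..., v_{j-1}), else None."""
    for j, v in enumerate(vectors, start=1):
        if in_span(vectors[:j - 1], v):
            return j
    return None

vs = [(2, 3, 1), (1, -1, 2), (7, 3, 8)]
print(dependence_index(vs))                # the dependent list from Example 2.20
print(dependence_index([(0, 0), (1, 0)]))  # the special case v_1 = 0
```

For the list from Example 2.20 the search stops at the third vector, and a list that starts with the zero vector gives $j = 1$, matching the remark above about $\operatorname{span}() = \{0\}$.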

In  general,  the  proofs  in  the  rest  of  the  book  will  not  call  attention  to 
special  cases  that  must  be  considered  involving  empty  lists,  lists  of  length  1, 
the  subspace  {0},  or  other  trivial  cases  for  which  the  result  is  clearly  true  but 
needs  a  slightly  different  proof.  Be  sure  to  check  these  special  cases  yourself. 

Now we come to a key result. It says that no linearly independent list in $V$
is longer than a spanning list in $V$.

2.23 Length of linearly independent list $\le$ length of spanning list

In a finite-dimensional vector space, the length of every linearly
independent list of vectors is less than or equal to the length of every
spanning list of vectors.

Proof Suppose $u_1, \dots, u_m$ is linearly independent in $V$. Suppose also
that $w_1, \dots, w_n$ spans $V$. We need to prove that $m \le n$. We do so
through the multi-step process described below; note that in each step we add
one of the $u$'s and remove one of the $w$'s.

Step 1

Let $B$ be the list $w_1, \dots, w_n$, which spans $V$. Thus adjoining any
vector in $V$ to this list produces a linearly dependent list (because the
newly adjoined vector can be written as a linear combination of the other
vectors). In particular, the list

\[ u_1, w_1, \dots, w_n \]

is linearly dependent. Thus by the Linear Dependence Lemma (2.21), we can
remove one of the $w$'s so that the new list $B$ (of length $n$) consisting of
$u_1$ and the remaining $w$'s spans $V$.

Step j

The list $B$ (of length $n$) from step $j - 1$ spans $V$. Thus adjoining any
vector to this list produces a linearly dependent list. In particular, the
list of length $n + 1$ obtained by adjoining $u_j$ to $B$, placing it just
after $u_1, \dots, u_{j-1}$, is linearly dependent. By the Linear Dependence
Lemma (2.21), one of the vectors in this list is in the span of the previous
ones, and because $u_1, \dots, u_j$ is linearly independent, this vector is one
of the $w$'s, not one of the $u$'s. We can remove that $w$ from $B$ so that the
new list $B$ (of length $n$) consisting of $u_1, \dots, u_j$ and the remaining
$w$'s spans $V$.

After step $m$, we have added all the $u$'s and the process stops. At each
step as we add a $u$ to $B$, the Linear Dependence Lemma implies that there is
some $w$ to remove. Thus there are at least as many $w$'s as $u$'s. ■

The next two examples show how the result above can be used to show,
without any computations, that certain lists are not linearly independent and
that certain lists do not span a given vector space.

2.24 Example Show that the list $(1, 2, 3), (4, 5, 8), (9, 6, 7), (-3, 2, 8)$
is not linearly independent in $\mathbf{R}^3$.

Solution The list $(1, 0, 0), (0, 1, 0), (0, 0, 1)$ spans $\mathbf{R}^3$. Thus
no list of length larger than 3 is linearly independent in $\mathbf{R}^3$.


2.25 Example Show that the list $(1, 2, 3, -5), (4, 5, 8, 3), (9, 6, 7, -1)$
does not span $\mathbf{R}^4$.

Solution The list $(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)$ is
linearly independent in $\mathbf{R}^4$. Thus no list of length less than 4
spans $\mathbf{R}^4$.

Our  intuition  suggests  that  every  subspace  of  a  finite-dimensional  vector 
space  should  also  be  finite-dimensional.  We  now  prove  that  this  intuition  is 
correct. 

2.26  Finite-dimensional  subspaces 

Every  subspace  of  a  finite-dimensional  vector  space  is  finite-dimensional. 

Proof  Suppose  V  is  finite-dimensional  and  U  is  a  subspace  of  V.  We  need  to 
prove  that  U  is  finite-dimensional.  We  do  this  through  the  following  multi-step 
construction. 

Step  1 

If $U = \{0\}$, then $U$ is finite-dimensional and we are done. If
$U \ne \{0\}$, then choose a nonzero vector $v_1 \in U$.


Step j

If $U = \operatorname{span}(v_1, \dots, v_{j-1})$, then $U$ is
finite-dimensional and we are done. If
$U \ne \operatorname{span}(v_1, \dots, v_{j-1})$, then choose a vector
$v_j \in U$ such that

\[ v_j \notin \operatorname{span}(v_1, \dots, v_{j-1}). \]

After each step, as long as the process continues, we have constructed a list of
vectors such that no vector in this list is in the span of the previous vectors.
Thus after each step we have constructed a linearly independent list, by the
Linear Dependence Lemma (2.21). This linearly independent list cannot be
longer than any spanning list of $V$ (by 2.23). Thus the process eventually
terminates, which means that $U$ is finite-dimensional. ■

EXERCISES 2.A

1 Suppose $v_1, v_2, v_3, v_4$ spans $V$. Prove that the list

\[ v_1 - v_2, v_2 - v_3, v_3 - v_4, v_4 \]

also spans $V$.

2 Verify the assertions in Example 2.18.

3 Find a number $t$ such that

\[ (3, 1, 4), (2, -3, 5), (5, 9, t) \]

is not linearly independent in $\mathbf{R}^3$.

4 Verify the assertion in the second bullet point in Example 2.20.

5 (a) Show that if we think of $\mathbf{C}$ as a vector space over
$\mathbf{R}$, then the list $(1 + i, 1 - i)$ is linearly independent.

(b) Show that if we think of $\mathbf{C}$ as a vector space over
$\mathbf{C}$, then the list $(1 + i, 1 - i)$ is linearly dependent.

6 Suppose $v_1, v_2, v_3, v_4$ is linearly independent in $V$. Prove that the
list

\[ v_1 - v_2, v_2 - v_3, v_3 - v_4, v_4 \]

is also linearly independent.

7 Prove or give a counterexample: If $v_1, v_2, \dots, v_m$ is a linearly
independent list of vectors in $V$, then

\[ 5v_1 - 4v_2, v_2, v_3, \dots, v_m \]

is linearly independent.

8 Prove or give a counterexample: If $v_1, v_2, \dots, v_m$ is a linearly
independent list of vectors in $V$ and $\lambda \in \mathbf{F}$ with
$\lambda \ne 0$, then $\lambda v_1, \lambda v_2, \dots, \lambda v_m$ is linearly
independent.

9 Prove or give a counterexample: If $v_1, \dots, v_m$ and $w_1, \dots, w_m$ are
linearly independent lists of vectors in $V$, then
$v_1 + w_1, \dots, v_m + w_m$ is linearly independent.

10 Suppose $v_1, \dots, v_m$ is linearly independent in $V$ and $w \in V$. Prove
that if $v_1 + w, \dots, v_m + w$ is linearly dependent, then
$w \in \operatorname{span}(v_1, \dots, v_m)$.

11 Suppose $v_1, \dots, v_m$ is linearly independent in $V$ and $w \in V$. Show
that $v_1, \dots, v_m, w$ is linearly independent if and only if

\[ w \notin \operatorname{span}(v_1, \dots, v_m). \]

12 Explain why there does not exist a list of six polynomials that is linearly
independent in $\mathcal{P}_4(\mathbf{F})$.

13 Explain why no list of four polynomials spans $\mathcal{P}_4(\mathbf{F})$.

14 Prove that $V$ is infinite-dimensional if and only if there is a sequence
$v_1, v_2, \dots$ of vectors in $V$ such that $v_1, \dots, v_m$ is linearly
independent for every positive integer $m$.

15 Prove that $\mathbf{F}^\infty$ is infinite-dimensional.

16 Prove that the real vector space of all continuous real-valued functions
on the interval $[0, 1]$ is infinite-dimensional.

17 Suppose $p_0, p_1, \dots, p_m$ are polynomials in $\mathcal{P}_m(\mathbf{F})$
such that $p_j(2) = 0$ for each $j$. Prove that $p_0, p_1, \dots, p_m$ is not
linearly independent in $\mathcal{P}_m(\mathbf{F})$.


2.B Bases

In the last section, we discussed linearly independent lists and spanning lists.
Now we bring these concepts together.

2.27 Definition basis

A basis of $V$ is a list of vectors in $V$ that is linearly independent and
spans $V$.


2.28 Example bases

(a) The list $(1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, \dots, 0, 1)$ is
a basis of $\mathbf{F}^n$, called the standard basis of $\mathbf{F}^n$.

(b) The list $(1, 2), (3, 5)$ is a basis of $\mathbf{F}^2$.

(c) The list $(1, 2, -4), (7, -5, 6)$ is linearly independent in
$\mathbf{F}^3$ but is not a basis of $\mathbf{F}^3$ because it does not span
$\mathbf{F}^3$.

(d) The list $(1, 2), (3, 5), (4, 13)$ spans $\mathbf{F}^2$ but is not a basis
of $\mathbf{F}^2$ because it is not linearly independent.

(e) The list $(1, 1, 0), (0, 0, 1)$ is a basis of
$\{(x, x, y) \in \mathbf{F}^3 : x, y \in \mathbf{F}\}$.

(f) The list $(1, -1, 0), (1, 0, -1)$ is a basis of

\[ \{(x, y, z) \in \mathbf{F}^3 : x + y + z = 0\}. \]

(g) The list $1, z, \dots, z^m$ is a basis of $\mathcal{P}_m(\mathbf{F})$.


In addition to the standard basis, $\mathbf{F}^n$ has many other bases. For
example, $(7, 5), (-4, 9)$ and $(1, 2), (3, 5)$ are both bases of
$\mathbf{F}^2$.

The  next  result  helps  explain  why  bases  are  useful.  Recall  that  “uniquely” 
means  “in  only  one  way”. 


2.29  Criterion  for  basis 

A list $v_1, \dots, v_n$ of vectors in $V$ is a basis of $V$ if and only if
every $v \in V$ can be written uniquely in the form

\[ v = a_1 v_1 + \cdots + a_n v_n, \tag{2.30} \]

where $a_1, \dots, a_n \in \mathbf{F}$.

Proof First suppose that $v_1, \dots, v_n$ is a basis of $V$. Let $v \in V$.
Because $v_1, \dots, v_n$ spans $V$, there exist
$a_1, \dots, a_n \in \mathbf{F}$ such that 2.30 holds. To show that the
representation in 2.30 is unique, suppose $c_1, \dots, c_n$ are scalars such
that we also have

\[ v = c_1 v_1 + \cdots + c_n v_n. \]

Subtracting the last equation from 2.30, we get

\[ 0 = (a_1 - c_1) v_1 + \cdots + (a_n - c_n) v_n. \]

This implies that each $a_j - c_j$ equals 0 (because $v_1, \dots, v_n$ is
linearly independent). Hence $a_1 = c_1, \dots, a_n = c_n$. We have the desired
uniqueness, completing the proof in one direction.

For the other direction, suppose every $v \in V$ can be written uniquely in
the form given by 2.30. Clearly this implies that $v_1, \dots, v_n$ spans $V$.
To show that $v_1, \dots, v_n$ is linearly independent, suppose
$a_1, \dots, a_n \in \mathbf{F}$ are such that

\[ 0 = a_1 v_1 + \cdots + a_n v_n. \]

The uniqueness of the representation 2.30 (taking $v = 0$) now implies that
$a_1 = \cdots = a_n = 0$. Thus $v_1, \dots, v_n$ is linearly independent and
hence is a basis of $V$. ■

This proof is essentially a repetition of the ideas that led us to the
definition of linear independence.

A  spanning  list  in  a  vector  space  may  not  be  a  basis  because  it  is  not 
linearly  independent.  Our  next  result  says  that  given  any  spanning  list,  some 
(possibly  none)  of  the  vectors  in  it  can  be  discarded  so  that  the  remaining  list 
is  linearly  independent  and  still  spans  the  vector  space. 

As an example in the vector space $\mathbf{F}^2$, if the procedure in the proof
below is applied to the list $(1, 2), (3, 6), (4, 7), (5, 9)$, then the second
and fourth vectors will be removed. This leaves $(1, 2), (4, 7)$, which is a
basis of $\mathbf{F}^2$.

2.31  Spanning  list  contains  a  basis 

Every  spanning  list  in  a  vector  space  can  be  reduced  to  a  basis  of  the 
vector  space. 

Proof Suppose $v_1, \dots, v_n$ spans $V$. We want to remove some of the vectors
from $v_1, \dots, v_n$ so that the remaining vectors form a basis of $V$. We do
this through the multi-step process described below.

Start with $B$ equal to the list $v_1, \dots, v_n$.




Step 1

If $v_1 = 0$, delete $v_1$ from $B$. If $v_1 \ne 0$, leave $B$ unchanged.


Step j

If $v_j$ is in $\operatorname{span}(v_1, \dots, v_{j-1})$, delete $v_j$ from
$B$. If $v_j$ is not in $\operatorname{span}(v_1, \dots, v_{j-1})$, leave $B$
unchanged.

Stop the process after step $n$, getting a list $B$. This list $B$ spans $V$
because our original list spanned $V$ and we have discarded only vectors that
were already in the span of the previous vectors. The process ensures that no
vector in $B$ is in the span of the previous ones. Thus $B$ is linearly
independent, by the Linear Dependence Lemma (2.21). Hence $B$ is a basis of
$V$. ■
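
The multi-step process in the proof above translates directly into a loop. The sketch below is an added illustration for lists in $\mathbf{F}^n$ with rational entries; the names `count_pivots`, `in_span`, and `reduce_to_basis` are hypothetical. It tests each $v_j$ against the span of the vectors kept so far, which equals $\operatorname{span}(v_1, \dots, v_{j-1})$ because only vectors already in the span of their predecessors have been discarded.

```python
from fractions import Fraction

def count_pivots(vectors):
    """Number of pivots found by exact Gaussian elimination on the list."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    width = len(rows[0]) if rows else 0
    r = 0
    for c in range(width):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    """True when target lies in the span of the list (span of () is {0})."""
    vectors = list(vectors)
    return count_pivots(vectors + [target]) == count_pivots(vectors)

def reduce_to_basis(spanning_list):
    """Steps 1..n of the proof: drop each vector in the span of earlier ones."""
    kept = []
    for v in spanning_list:
        if not in_span(kept, v):
            kept.append(v)
    return kept

print(reduce_to_basis([(1, 2), (3, 6), (4, 7), (5, 9)]))
```

Applied to the list $(1, 2), (3, 6), (4, 7), (5, 9)$ discussed before the statement of 2.31, the loop removes the second and fourth vectors and keeps $(1, 2), (4, 7)$.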

Our  next  result,  an  easy  corollary  of  the  previous  result,  tells  us  that  every 
finite-dimensional  vector  space  has  a  basis. 

2.32  Basis  of  finite-dimensional  vector  space 

Every  finite-dimensional  vector  space  has  a  basis. 

Proof  By  definition,  a  finite-dimensional  vector  space  has  a  spanning  list. 
The  previous  result  tells  us  that  each  spanning  list  can  be  reduced  to  a  basis.  ■ 

Our  next  result  is  in  some  sense  a  dual  of  2.31,  which  said  that  every 
spanning  list  can  be  reduced  to  a  basis.  Now  we  show  that  given  any  linearly 
independent  list,  we  can  adjoin  some  additional  vectors  (this  includes  the 
possibility  of  adjoining  no  additional  vectors)  so  that  the  extended  list  is  still 
linearly  independent  but  also  spans  the  space. 

2.33  Linearly  independent  list  extends  to  a  basis 

Every  linearly  independent  list  of  vectors  in  a  finite-dimensional  vector 
space  can  be  extended  to  a  basis  of  the  vector  space. 

Proof Suppose $u_1, \dots, u_m$ is linearly independent in a finite-dimensional
vector space $V$. Let $w_1, \dots, w_n$ be a basis of $V$. Thus the list

\[ u_1, \dots, u_m, w_1, \dots, w_n \]

spans $V$. Applying the procedure of the proof of 2.31 to reduce this list to a
basis of $V$ produces a basis consisting of the vectors $u_1, \dots, u_m$ (none
of the $u$'s get deleted in this procedure because $u_1, \dots, u_m$ is linearly
independent) and some of the $w$'s. ■

As an example in $\mathbf{F}^3$, suppose we start with the linearly independent
list $(2, 3, 4), (9, 6, 8)$. If we take $w_1, w_2, w_3$ in the proof above to be
the standard basis of $\mathbf{F}^3$, then the procedure in the proof above
produces the list $(2, 3, 4), (9, 6, 8), (0, 1, 0)$, which is a basis of
$\mathbf{F}^3$.
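
The extension procedure can be reproduced for this example. The sketch below is an added illustration under the same assumptions as the proof, specialised to $\mathbf{F}^n$ with rational entries: it adjoins the standard basis to the given list and then applies the reduction of 2.31. The names `count_pivots`, `in_span`, and `extend_to_basis` are hypothetical.

```python
from fractions import Fraction

def count_pivots(vectors):
    """Number of pivots found by exact Gaussian elimination on the list."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    width = len(rows[0]) if rows else 0
    r = 0
    for c in range(width):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(vectors, target):
    """True when target lies in the span of the list (span of () is {0})."""
    vectors = list(vectors)
    return count_pivots(vectors + [target]) == count_pivots(vectors)

def extend_to_basis(independent, n):
    """Adjoin the standard basis of F^n, then reduce as in the proof of 2.31."""
    standard = [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]
    kept = []
    for v in list(independent) + standard:
        if not in_span(kept, v):
            kept.append(v)
    return kept

print(extend_to_basis([(2, 3, 4), (9, 6, 8)], 3))
```

Here $(1, 0, 0)$ is discarded because it already lies in the span of the two starting vectors, $(0, 1, 0)$ is kept, and $(0, 0, 1)$ is then discarded, giving the list stated above. The function assumes its input is linearly independent; a dependent input would simply lose vectors during the reduction.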

As  an  application  of  the  result  above, 
we  now  show  that  every  subspace  of  a 
finite-dimensional  vector  space  can  be 
paired  with  another  subspace  to  form  a 
direct  sum  of  the  whole  space. 

2.34  Every  subspace  of  V  is  part  of  a  direct  sum  equal  to  V 

Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Then there is a
subspace $W$ of $V$ such that $V = U \oplus W$.

Proof Because $V$ is finite-dimensional, so is $U$ (see 2.26). Thus there is
a basis $u_1, \dots, u_m$ of $U$ (see 2.32). Of course $u_1, \dots, u_m$ is a
linearly independent list of vectors in $V$. Hence this list can be extended to
a basis $u_1, \dots, u_m, w_1, \dots, w_n$ of $V$ (see 2.33). Let
$W = \operatorname{span}(w_1, \dots, w_n)$.

To prove that $V = U \oplus W$, by 1.45 we need only show that

\[ V = U + W \quad \text{and} \quad U \cap W = \{0\}. \]

To prove the first equation above, suppose $v \in V$. Then, because the list
$u_1, \dots, u_m, w_1, \dots, w_n$ spans $V$, there exist
$a_1, \dots, a_m, b_1, \dots, b_n \in \mathbf{F}$ such that

\[ v = \underbrace{a_1 u_1 + \cdots + a_m u_m}_{u} + \underbrace{b_1 w_1 + \cdots + b_n w_n}_{w}. \]

In other words, we have $v = u + w$, where $u \in U$ and $w \in W$ are defined
as above. Thus $v \in U + W$, completing the proof that $V = U + W$.

To show that $U \cap W = \{0\}$, suppose $v \in U \cap W$. Then there exist
scalars $a_1, \dots, a_m, b_1, \dots, b_n \in \mathbf{F}$ such that

\[ v = a_1 u_1 + \cdots + a_m u_m = b_1 w_1 + \cdots + b_n w_n. \]

Thus

\[ a_1 u_1 + \cdots + a_m u_m - b_1 w_1 - \cdots - b_n w_n = 0. \]

Because $u_1, \dots, u_m, w_1, \dots, w_n$ is linearly independent, this implies
that $a_1 = \cdots = a_m = b_1 = \cdots = b_n = 0$. Thus $v = 0$, completing the
proof that $U \cap W = \{0\}$. ■


Using the same basic ideas but considerably more advanced tools, the result
above (2.34) can be proved without the hypothesis that $V$ is
finite-dimensional.

EXERCISES 2.B

1 Find all vector spaces that have exactly one basis.

2 Verify all the assertions in Example 2.28.

3 (a) Let $U$ be the subspace of $\mathbf{R}^5$ defined by

\[ U = \{(x_1, x_2, x_3, x_4, x_5) \in \mathbf{R}^5 : x_1 = 3x_2 \text{ and } x_3 = 7x_4\}. \]

Find a basis of $U$.

(b) Extend the basis in part (a) to a basis of $\mathbf{R}^5$.

(c) Find a subspace $W$ of $\mathbf{R}^5$ such that
$\mathbf{R}^5 = U \oplus W$.

4 (a) Let $U$ be the subspace of $\mathbf{C}^5$ defined by

\[ U = \{(z_1, z_2, z_3, z_4, z_5) \in \mathbf{C}^5 : 6z_1 = z_2 \text{ and } z_3 + 2z_4 + 3z_5 = 0\}. \]

Find a basis of $U$.

(b) Extend the basis in part (a) to a basis of $\mathbf{C}^5$.

(c) Find a subspace $W$ of $\mathbf{C}^5$ such that
$\mathbf{C}^5 = U \oplus W$.

5 Prove or disprove: there exists a basis $p_0, p_1, p_2, p_3$ of
$\mathcal{P}_3(\mathbf{F})$ such that none of the polynomials
$p_0, p_1, p_2, p_3$ has degree 2.

6 Suppose $v_1, v_2, v_3, v_4$ is a basis of $V$. Prove that

\[ v_1 + v_2, v_2 + v_3, v_3 + v_4, v_4 \]

is also a basis of $V$.

7 Prove or give a counterexample: If $v_1, v_2, v_3, v_4$ is a basis of $V$ and
$U$ is a subspace of $V$ such that $v_1, v_2 \in U$ and $v_3 \notin U$ and
$v_4 \notin U$, then $v_1, v_2$ is a basis of $U$.

8 Suppose $U$ and $W$ are subspaces of $V$ such that $V = U \oplus W$. Suppose
also that $u_1, \dots, u_m$ is a basis of $U$ and $w_1, \dots, w_n$ is a basis
of $W$. Prove that

\[ u_1, \dots, u_m, w_1, \dots, w_n \]

is a basis of $V$.

2.C Dimension

Although we have been discussing finite-dimensional vector spaces, we have
not yet defined the dimension of such an object. How should dimension be
defined? A reasonable definition should force the dimension of $\mathbf{F}^n$
to equal $n$. Notice that the standard basis

\[ (1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, \dots, 0, 1) \]

of $\mathbf{F}^n$ has length $n$. Thus we are tempted to define the dimension
as the length of a basis. However, a finite-dimensional vector space in
general has many different bases, and our attempted definition makes sense
only if all bases in a given vector space have the same length. Fortunately
that turns out to be the case, as we now show.


2.35  Basis  length  does  not  depend  on  basis 

Any  two  bases  of  a  finite-dimensional  vector  space  have  the  same  length. 

Proof Suppose $V$ is finite-dimensional. Let $B_1$ and $B_2$ be two bases of
$V$. Then $B_1$ is linearly independent in $V$ and $B_2$ spans $V$, so the
length of $B_1$ is at most the length of $B_2$ (by 2.23). Interchanging the
roles of $B_1$ and $B_2$, we also see that the length of $B_2$ is at most the
length of $B_1$. Thus the length of $B_1$ equals the length of $B_2$, as
desired. ■

Now  that  we  know  that  any  two  bases  of  a  finite-dimensional  vector  space 
have  the  same  length,  we  can  formally  define  the  dimension  of  such  spaces. 

2.36  Definition  dimension ,  dim  V 

•  The  dimension  of  a  finite-dimensional  vector  space  is  the  length  of 
any  basis  of  the  vector  space. 

•  The  dimension  of  V  (if  V  is  finite-dimensional)  is  denoted  by  dim  V. 


2.37  Example  dimensions 

•  dimFw  =  n  because  the  standard  basis  of  Fn  has  length  n. 

•  dimVm(P)  =  m  +  1  because  the  basis  l,z, . . .  ,zm  of  Vm(F)  has 
length  m  +  1 . 
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Every  subspace  of  a  finite-dimensional  vector  space  is  finite-dimensional 
(by  2.26)  and  so  has  a  dimension.  The  next  result  gives  the  expected  inequality 
about  the  dimension  of  a  subspace. 

2.38  Dimension  of  a  subspace 

If  V  is  finite-dimensional  and  U  is  a  subspace  of  V,  then  dim  U  <  dim  V. 

Proof  Suppose  V  is  finite-dimensional  and  U  is  a  subspace  of  V.  Think  of  a 
basis  of  U  as  a  linearly  independent  list  in  V,  and  think  of  a  basis  of  V  as  a 
spanning list in V. Now use 2.23 to conclude that $\dim U \le \dim V$. ■

To check that a list of vectors in V is a basis of V, we must, according to
the definition, show that the list in question satisfies two properties: it must be
linearly independent and it must span V. The next two results show that if the
list in question has the right length, then we need only check that it satisfies one
of the two required properties. First we prove that every linearly independent
list with the right length is a basis.

2.39  Linearly  independent  list  of  the  right  length  is  a  basis 

Suppose  V  is  finite-dimensional.  Then  every  linearly  independent  list  of 
vectors  in  V  with  length  dim  V  is  a  basis  of  V. 

Proof Suppose $\dim V = n$ and $v_1, \dots, v_n$ is linearly independent in V. The
list $v_1, \dots, v_n$ can be extended to a basis of V (by 2.33). However, every basis
of V has length $n$, so in this case the extension is the trivial one, meaning that
no elements are adjoined to $v_1, \dots, v_n$. In other words, $v_1, \dots, v_n$ is a basis
of V, as desired. ■


The real vector space $\mathbf{R}^2$ has dimension 2; the complex vector
space $\mathbf{C}$ has dimension 1. As sets, $\mathbf{R}^2$ can be identified with $\mathbf{C}$
(and addition is the same on both spaces, as is scalar multiplication
by real numbers). Thus when we talk about the dimension of a vector
space, the role played by the choice of $\mathbf{F}$ cannot be neglected.


2.40 Example Show that the list $(5, 7), (4, 3)$ is a basis of $\mathbf{F}^2$.

Solution This list of two vectors in $\mathbf{F}^2$ is obviously linearly independent
(because neither vector is a scalar multiple of the other). Note that $\mathbf{F}^2$ has
dimension 2. Thus 2.39 implies that the linearly independent list $(5, 7), (4, 3)$
of length 2 is a basis of $\mathbf{F}^2$ (we do not need to bother checking that it spans $\mathbf{F}^2$).
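The check in Example 2.40 can also be done mechanically. The following is only a sketch, assuming rational entries; the helper name `rank` is ours (it does not appear in the text) and computes the dimension of the span of a list by Gaussian elimination. A list of length 2 in $\mathbf{F}^2$ whose span has dimension 2 is linearly independent, so 2.39 applies.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the span of a list of vectors (Gaussian elimination over Q)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


vectors = [(5, 7), (4, 3)]
independent = rank(vectors) == len(vectors)  # span has dimension 2
right_length = len(vectors) == 2             # dim F^2 = 2
print(independent and right_length)
```

By contrast, `rank([(5, 7), (10, 14)])` is 1, because the second vector is a scalar multiple of the first.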
2.41 Example Show that $1, (x - 5)^2, (x - 5)^3$ is a basis of the subspace
U of $\mathcal{P}_3(\mathbf{R})$ defined by

$$U = \{p \in \mathcal{P}_3(\mathbf{R}) : p'(5) = 0\}.$$

Solution Clearly each of the polynomials $1$, $(x - 5)^2$, and $(x - 5)^3$ is in U.

Suppose $a, b, c \in \mathbf{R}$ and

$$a + b(x - 5)^2 + c(x - 5)^3 = 0$$

for every $x \in \mathbf{R}$. Without explicitly expanding the left side of the equation
above, we can see that the left side has a $cx^3$ term. Because the right side has
no $x^3$ term, this implies that $c = 0$. Because $c = 0$, we see that the left side
has a $bx^2$ term, which implies that $b = 0$. Because $b = c = 0$, we can also
conclude that $a = 0$.

Thus the equation above implies that $a = b = c = 0$. Hence the list
$1, (x - 5)^2, (x - 5)^3$ is linearly independent in U.

Thus $\dim U \ge 3$. Because U is a subspace of $\mathcal{P}_3(\mathbf{R})$, we know that
$\dim U \le \dim \mathcal{P}_3(\mathbf{R}) = 4$ (by 2.38). However, $\dim U$ cannot equal 4, because
otherwise when we extend a basis of U to a basis of $\mathcal{P}_3(\mathbf{R})$ we would get a
list with length greater than 4. Hence $\dim U = 3$. Thus 2.39 implies that the
linearly independent list $1, (x - 5)^2, (x - 5)^3$ is a basis of U.
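The steps of Example 2.41 can be spot-checked numerically. The sketch below assumes polynomials are stored as coefficient lists $[c_0, c_1, c_2, c_3]$; the helper names `rank` and `derivative_at` are ours. It checks that each polynomial satisfies $p'(5) = 0$, that the three coefficient vectors are linearly independent, and that the polynomial $x$ is not in U (so U is not all of $\mathcal{P}_3(\mathbf{R})$).

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the span of a list of vectors (Gaussian elimination over Q)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def derivative_at(coeffs, x):
    """Value of p'(x) for p = c0 + c1 x + c2 x^2 + ..."""
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


candidates = [
    [1, 0, 0, 0],        # 1
    [25, -10, 1, 0],     # (x - 5)^2 = 25 - 10x + x^2
    [-125, 75, -15, 1],  # (x - 5)^3 = -125 + 75x - 15x^2 + x^3
]
all_in_U = all(derivative_at(p, 5) == 0 for p in candidates)
independent = rank(candidates) == len(candidates)
x_in_U = derivative_at([0, 1, 0, 0], 5) == 0  # the polynomial x
print(all_in_U, independent, x_in_U)
```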

Now  we  prove  that  a  spanning  list  with  the  right  length  is  a  basis. 

2.42  Spanning  list  of  the  right  length  is  a  basis 

Suppose  V  is  finite-dimensional.  Then  every  spanning  list  of  vectors  in  V 
with  length  dim  V  is  a  basis  of  V. 

Proof Suppose $\dim V = n$ and $v_1, \dots, v_n$ spans V. The list $v_1, \dots, v_n$ can
be reduced to a basis of V (by 2.31). However, every basis of V has length
$n$, so in this case the reduction is the trivial one, meaning that no elements
are deleted from $v_1, \dots, v_n$. In other words, $v_1, \dots, v_n$ is a basis of V, as
desired. ■

The  next  result  gives  a  formula  for  the  dimension  of  the  sum  of  two 
subspaces  of  a  finite-dimensional  vector  space.  This  formula  is  analogous 
to  a  familiar  counting  formula:  the  number  of  elements  in  the  union  of  two 
finite  sets  equals  the  number  of  elements  in  the  first  set,  plus  the  number  of 
elements  in  the  second  set,  minus  the  number  of  elements  in  the  intersection 
of  the  two  sets. 
2.43 Dimension of a sum

If $U_1$ and $U_2$ are subspaces of a finite-dimensional vector space, then

$$\dim(U_1 + U_2) = \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2).$$

Proof Let $u_1, \dots, u_m$ be a basis of $U_1 \cap U_2$; thus $\dim(U_1 \cap U_2) = m$. Because
$u_1, \dots, u_m$ is a basis of $U_1 \cap U_2$, it is linearly independent in $U_1$.
Hence this list can be extended to a basis $u_1, \dots, u_m, v_1, \dots, v_j$ of $U_1$
(by 2.33). Thus $\dim U_1 = m + j$. Also extend $u_1, \dots, u_m$ to a basis
$u_1, \dots, u_m, w_1, \dots, w_k$ of $U_2$; thus $\dim U_2 = m + k$.

We will show that

$$u_1, \dots, u_m, v_1, \dots, v_j, w_1, \dots, w_k$$

is a basis of $U_1 + U_2$. This will complete the proof, because then we will have

$$\begin{aligned}
\dim(U_1 + U_2) &= m + j + k \\
&= (m + j) + (m + k) - m \\
&= \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2).
\end{aligned}$$

Clearly $\operatorname{span}(u_1, \dots, u_m, v_1, \dots, v_j, w_1, \dots, w_k)$ contains $U_1$ and $U_2$ and
hence equals $U_1 + U_2$. So to show that this list is a basis of $U_1 + U_2$ we need
only show that it is linearly independent. To prove this, suppose

$$a_1u_1 + \dots + a_mu_m + b_1v_1 + \dots + b_jv_j + c_1w_1 + \dots + c_kw_k = 0,$$

where all the $a$'s, $b$'s, and $c$'s are scalars. We need to prove that all the $a$'s,
$b$'s, and $c$'s equal 0. The equation above can be rewritten as

$$c_1w_1 + \dots + c_kw_k = -a_1u_1 - \dots - a_mu_m - b_1v_1 - \dots - b_jv_j,$$

which shows that $c_1w_1 + \dots + c_kw_k \in U_1$. All the $w$'s are in $U_2$, so this
implies that $c_1w_1 + \dots + c_kw_k \in U_1 \cap U_2$. Because $u_1, \dots, u_m$ is a basis
of $U_1 \cap U_2$, we can write

$$c_1w_1 + \dots + c_kw_k = d_1u_1 + \dots + d_mu_m$$

for some choice of scalars $d_1, \dots, d_m$. But $u_1, \dots, u_m, w_1, \dots, w_k$ is linearly
independent, so the last equation implies that all the $c$'s (and $d$'s) equal 0.
Thus our original equation involving the $a$'s, $b$'s, and $c$'s becomes

$$a_1u_1 + \dots + a_mu_m + b_1v_1 + \dots + b_jv_j = 0.$$

Because the list $u_1, \dots, u_m, v_1, \dots, v_j$ is linearly independent, this equation
implies that all the $a$'s and $b$'s are 0. We now know that all the $a$'s, $b$'s, and
$c$'s equal 0, as desired. ■
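A small numerical instance may make the formula in 2.43 concrete. This is only an illustration under our own assumptions: the subspaces `U1` and `U2` of $\mathbf{Q}^4$ are hypothetical examples, and the helpers `rank` and `in_span` are ours. Here $U_1 \cap U_2$ is spanned by $(1, 2, 1, 0)$: that vector lies in both subspaces, while $(0, 0, 0, 1)$ lies in $U_2$ but not in $U_1$, so the intersection is a proper, nonzero subspace of the 2-dimensional space $U_2$.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the span of a list of vectors (Gaussian elimination over Q)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_span(v, vectors):
    """True if adjoining v does not enlarge the span of the given list."""
    return rank(list(vectors) + [v]) == rank(vectors)


U1 = [(1, 1, 0, 0), (0, 1, 1, 0)]  # a basis of U1 (hypothetical example)
U2 = [(1, 2, 1, 0), (0, 0, 0, 1)]  # a basis of U2 (hypothetical example)
common = (1, 2, 1, 0)

# The intersection contains `common` but is not all of U2, so its dimension is 1.
common_in_both = in_span(common, U1) and in_span(common, U2)
u2_inside_u1 = in_span((0, 0, 0, 1), U1)
dim_intersection = 1

dim_sum = rank(U1 + U2)  # the concatenated list spans U1 + U2
print(dim_sum == rank(U1) + rank(U2) - dim_intersection)
```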
EXERCISES 2.C


1 Suppose V is finite-dimensional and U is a subspace of V such that
$\dim U = \dim V$. Prove that $U = V$.

2 Show that the subspaces of $\mathbf{R}^2$ are precisely $\{0\}$, $\mathbf{R}^2$, and all lines in $\mathbf{R}^2$
through the origin.

3 Show that the subspaces of $\mathbf{R}^3$ are precisely $\{0\}$, $\mathbf{R}^3$, all lines in $\mathbf{R}^3$
through the origin, and all planes in $\mathbf{R}^3$ through the origin.

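4 (a) Let $U = \{p \in \mathcal{P}_4(\mathbf{F}) : p(6) = 0\}$. Find a basis of U.

(b) Extend the basis in part (a) to a basis of $\mathcal{P}_4(\mathbf{F})$.

(c) Find a subspace W of $\mathcal{P}_4(\mathbf{F})$ such that $\mathcal{P}_4(\mathbf{F}) = U \oplus W$.

5 (a) Let $U = \{p \in \mathcal{P}_4(\mathbf{R}) : p''(6) = 0\}$. Find a basis of U.

(b) Extend the basis in part (a) to a basis of $\mathcal{P}_4(\mathbf{R})$.

(c) Find a subspace W of $\mathcal{P}_4(\mathbf{R})$ such that $\mathcal{P}_4(\mathbf{R}) = U \oplus W$.

6 (a) Let $U = \{p \in \mathcal{P}_4(\mathbf{F}) : p(2) = p(5)\}$. Find a basis of U.

(b) Extend the basis in part (a) to a basis of $\mathcal{P}_4(\mathbf{F})$.

(c) Find a subspace W of $\mathcal{P}_4(\mathbf{F})$ such that $\mathcal{P}_4(\mathbf{F}) = U \oplus W$.

7 (a) Let $U = \{p \in \mathcal{P}_4(\mathbf{F}) : p(2) = p(5) = p(6)\}$. Find a basis of U.

(b) Extend the basis in part (a) to a basis of $\mathcal{P}_4(\mathbf{F})$.

(c) Find a subspace W of $\mathcal{P}_4(\mathbf{F})$ such that $\mathcal{P}_4(\mathbf{F}) = U \oplus W$.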

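8 (a) Let $U = \{p \in \mathcal{P}_4(\mathbf{R}) : \int_{-1}^{1} p = 0\}$. Find a basis of U.

(b) Extend the basis in part (a) to a basis of $\mathcal{P}_4(\mathbf{R})$.

(c) Find a subspace W of $\mathcal{P}_4(\mathbf{R})$ such that $\mathcal{P}_4(\mathbf{R}) = U \oplus W$.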

9 Suppose $v_1, \dots, v_m$ is linearly independent in V and $w \in V$. Prove that

$$\dim \operatorname{span}(v_1 + w, \dots, v_m + w) \ge m - 1.$$

10 Suppose $p_0, p_1, \dots, p_m \in \mathcal{P}(\mathbf{F})$ are such that each $p_j$ has degree $j$.
Prove that $p_0, p_1, \dots, p_m$ is a basis of $\mathcal{P}_m(\mathbf{F})$.

11 Suppose that U and W are subspaces of $\mathbf{R}^8$ such that $\dim U = 3$,
$\dim W = 5$, and $U + W = \mathbf{R}^8$. Prove that $\mathbf{R}^8 = U \oplus W$.
12 Suppose U and W are both five-dimensional subspaces of $\mathbf{R}^9$. Prove
that $U \cap W \ne \{0\}$.

13 Suppose U and W are both 4-dimensional subspaces of $\mathbf{C}^6$. Prove that
there exist two vectors in $U \cap W$ such that neither of these vectors is a
scalar multiple of the other.

14 Suppose $U_1, \dots, U_m$ are finite-dimensional subspaces of V. Prove that
$U_1 + \dots + U_m$ is finite-dimensional and

$$\dim(U_1 + \dots + U_m) \le \dim U_1 + \dots + \dim U_m.$$

15 Suppose V is finite-dimensional, with $\dim V = n \ge 1$. Prove that there
exist 1-dimensional subspaces $U_1, \dots, U_n$ of V such that

$$V = U_1 \oplus \dots \oplus U_n.$$

16 Suppose $U_1, \dots, U_m$ are finite-dimensional subspaces of V such that
$U_1 + \dots + U_m$ is a direct sum. Prove that $U_1 \oplus \dots \oplus U_m$ is
finite-dimensional and

$$\dim U_1 \oplus \dots \oplus U_m = \dim U_1 + \dots + \dim U_m.$$

[The exercise above deepens the analogy between direct sums of subspaces
and disjoint unions of subsets. Specifically, compare this exercise
to the following obvious statement: if a set is written as a disjoint union
of finite subsets, then the number of elements in the set equals the sum of
the numbers of elements in the disjoint subsets.]

17 You might guess, by analogy with the formula for the number of elements
in the union of three subsets of a finite set, that if $U_1, U_2, U_3$ are
subspaces of a finite-dimensional vector space, then

$$\begin{aligned}
\dim(U_1 + U_2 + U_3) = {} & \dim U_1 + \dim U_2 + \dim U_3 \\
& - \dim(U_1 \cap U_2) - \dim(U_1 \cap U_3) - \dim(U_2 \cap U_3) \\
& + \dim(U_1 \cap U_2 \cap U_3).
\end{aligned}$$

Prove  this  or  give  a  counterexample. 
German mathematician Carl
Friedrich Gauss (1777-1855), who
in 1809 published a method for
solving systems of linear equations.
This method, now called Gaussian
elimination, was also used in a
Chinese book published over 1600
years earlier.


Linear  Maps 

So  far  our  attention  has  focused  on  vector  spaces.  No  one  gets  excited  about 
vector  spaces.  The  interesting  part  of  linear  algebra  is  the  subject  to  which  we 
now  turn — linear  maps. 

In  this  chapter  we  will  frequently  need  another  vector  space,  which  we  will 
call  W,  in  addition  to  V.  Thus  our  standing  assumptions  are  now  as  follows: 

3.1 Notation F, V, W

•  F  denotes  R  or  C. 

•  V  and  W  denote  vector  spaces  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  Fundamental  Theorem  of  Linear  Maps 

■  the  matrix  of  a  linear  map  with  respect  to  given  bases 

■  isomorphic  vector  spaces 

■  product  spaces 

■  quotient  spaces 

■  the  dual  space  of  a  vector  space  and  the  dual  of  a  linear  map 
The Vector Space of Linear Maps

Definition  and  Examples  of  Linear  Maps 

Now  we  are  ready  for  one  of  the  key  definitions  in  linear  algebra. 

3.2  Definition  linear  map 

A linear map from V to W is a function $T \colon V \to W$ with the following
properties:

additivity

$T(u + v) = Tu + Tv$ for all $u, v \in V$;

homogeneity

$T(\lambda v) = \lambda(Tv)$ for all $\lambda \in \mathbf{F}$ and all $v \in V$.

Note that for linear maps we often
use the notation $Tv$ as well as the more
standard functional notation $T(v)$.


Some  mathematicians  use  the 
term  linear  transformation,  which 
means  the  same  as  linear  map. 


3.A 


3.3 Notation $\mathcal{L}(V, W)$

The set of all linear maps from V to W is denoted $\mathcal{L}(V, W)$.

Let’s  look  at  some  examples  of  linear  maps.  Make  sure  you  verify  that 
each  of  the  functions  defined  below  is  indeed  a  linear  map: 

3.4  Example  linear  maps 
zero 

In  addition  to  its  other  uses,  we  let  the  symbol  0  denote  the  function  that  takes 
each  element  of  some  vector  space  to  the  additive  identity  of  another  vector 
space. To be specific, $0 \in \mathcal{L}(V, W)$ is defined by

$$0v = 0.$$

The  0  on  the  left  side  of  the  equation  above  is  a  function  from  V  to  W,  whereas 
the  0  on  the  right  side  is  the  additive  identity  in  W.  As  usual,  the  context 
should  allow  you  to  distinguish  between  the  many  uses  of  the  symbol  0. 

identity 

The identity map, denoted $I$, is the function on some vector space that takes
each element to itself. To be specific, $I \in \mathcal{L}(V, V)$ is defined by

$$Iv = v.$$
differentiation

Define $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ by

$$Dp = p'.$$

The assertion that this function is a linear map is another way of stating a basic
result about differentiation: $(f + g)' = f' + g'$ and $(\lambda f)' = \lambda f'$ whenever
$f, g$ are differentiable and $\lambda$ is a constant.

integration 

Define $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathbf{R})$ by

$$Tp = \int_0^1 p(x)\,dx.$$

The  assertion  that  this  function  is  linear  is  another  way  of  stating  a  basic  result 
about  integration:  the  integral  of  the  sum  of  two  functions  equals  the  sum 
of  the  integrals,  and  the  integral  of  a  constant  times  a  function  equals  the 
constant  times  the  integral  of  the  function. 

multiplication by $x^2$

Define $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ by

$$(Tp)(x) = x^2 p(x)$$

for $x \in \mathbf{R}$.

backward  shift 

Recall that $\mathbf{F}^\infty$ denotes the vector space of all sequences of elements of $\mathbf{F}$.
Define $T \in \mathcal{L}(\mathbf{F}^\infty, \mathbf{F}^\infty)$ by

$$T(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots).$$

from $\mathbf{R}^3$ to $\mathbf{R}^2$

Define $T \in \mathcal{L}(\mathbf{R}^3, \mathbf{R}^2)$ by

$$T(x, y, z) = (2x - y + 3z, 7x + 5y - 6z).$$

from $\mathbf{F}^n$ to $\mathbf{F}^m$

Generalizing the previous example, let $m$ and $n$ be positive integers, let
$A_{j,k} \in \mathbf{F}$ for $j = 1, \dots, m$ and $k = 1, \dots, n$, and define $T \in \mathcal{L}(\mathbf{F}^n, \mathbf{F}^m)$ by

$$T(x_1, \dots, x_n) = (A_{1,1}x_1 + \dots + A_{1,n}x_n, \dots, A_{m,1}x_1 + \dots + A_{m,n}x_n).$$

Actually every linear map from $\mathbf{F}^n$ to $\mathbf{F}^m$ is of this form.
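The last item of Example 3.4 translates directly into code. The sketch below is one possible rendering, not a canonical one: the function name `linear_map` is ours, and the array `A` holds the coefficients of the map from $\mathbf{R}^3$ to $\mathbf{R}^2$ in the previous item (taking its second coordinate to be $7x + 5y - 6z$). Additivity and homogeneity are then spot-checked on sample vectors; a spot check is of course not a proof.

```python
def linear_map(A):
    """Return T: F^n -> F^m with T(x)_j = A[j][0]*x[0] + ... + A[j][n-1]*x[n-1]."""
    def T(x):
        return tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)
    return T


# Coefficients A[j][k] for T(x, y, z) = (2x - y + 3z, 7x + 5y - 6z).
A = [[2, -1, 3],
     [7, 5, -6]]
T = linear_map(A)

u, v, lam = (1, 2, 3), (4, 0, -1), 5
u_plus_v = tuple(a + b for a, b in zip(u, v))
lam_u = tuple(lam * a for a in u)

additive = T(u_plus_v) == tuple(a + b for a, b in zip(T(u), T(v)))
homogeneous = T(lam_u) == tuple(lam * a for a in T(u))
print(T(u), additive, homogeneous)
```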

The  existence  part  of  the  next  result  means  that  we  can  find  a  linear  map 
that  takes  on  whatever  values  we  wish  on  the  vectors  in  a  basis.  The  uniqueness 
part  of  the  next  result  means  that  a  linear  map  is  completely  determined  by  its 
values  on  a  basis. 
3.5 Linear maps and basis of domain

Suppose $v_1, \dots, v_n$ is a basis of V and $w_1, \dots, w_n \in W$. Then there exists
a unique linear map $T \colon V \to W$ such that

$$Tv_j = w_j$$

for each $j = 1, \dots, n$.

Proof First we show the existence of a linear map T with the desired property.
Define $T \colon V \to W$ by

$$T(c_1v_1 + \dots + c_nv_n) = c_1w_1 + \dots + c_nw_n,$$

where $c_1, \dots, c_n$ are arbitrary elements of $\mathbf{F}$. The list $v_1, \dots, v_n$ is a basis
of V, and thus the equation above does indeed define a function T from
V to W (because each element of V can be uniquely written in the form
$c_1v_1 + \dots + c_nv_n$).

For each $j$, taking $c_j = 1$ and the other $c$'s equal to 0 in the equation
above shows that $Tv_j = w_j$.

If $u, v \in V$ with $u = a_1v_1 + \dots + a_nv_n$ and $v = c_1v_1 + \dots + c_nv_n$, then

$$\begin{aligned}
T(u + v) &= T\big((a_1 + c_1)v_1 + \dots + (a_n + c_n)v_n\big) \\
&= (a_1 + c_1)w_1 + \dots + (a_n + c_n)w_n \\
&= (a_1w_1 + \dots + a_nw_n) + (c_1w_1 + \dots + c_nw_n) \\
&= Tu + Tv.
\end{aligned}$$

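Similarly, if $\lambda \in \mathbf{F}$ and $v = c_1v_1 + \dots + c_nv_n$, then

$$\begin{aligned}
T(\lambda v) &= T(\lambda c_1v_1 + \dots + \lambda c_nv_n) \\
&= \lambda c_1w_1 + \dots + \lambda c_nw_n \\
&= \lambda(c_1w_1 + \dots + c_nw_n) \\
&= \lambda Tv.
\end{aligned}$$

Thus T is a linear map from V to W.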

To prove uniqueness, now suppose that $T \in \mathcal{L}(V, W)$ and that $Tv_j = w_j$
for $j = 1, \dots, n$. Let $c_1, \dots, c_n \in \mathbf{F}$. The homogeneity of T implies that
$T(c_jv_j) = c_jw_j$ for $j = 1, \dots, n$. The additivity of T now implies that

$$T(c_1v_1 + \dots + c_nv_n) = c_1w_1 + \dots + c_nw_n.$$

Thus T is uniquely determined on $\operatorname{span}(v_1, \dots, v_n)$ by the equation above.
Because $v_1, \dots, v_n$ is a basis of V, this implies that T is uniquely determined
on V. ■
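The existence half of the proof of 3.5 is constructive, and it can be sketched in code for the special case $V = \mathbf{Q}^n$. Everything named here is our own assumption for illustration: `coordinates` solves $c_1v_1 + \dots + c_nv_n = v$ by Gauss-Jordan elimination (the basis is assumed valid, so a pivot always exists), and `map_from_basis` returns the map sending that $v$ to $c_1w_1 + \dots + c_nw_n$. The basis is the one from Example 2.40; the target vectors `images` are arbitrary.

```python
from fractions import Fraction


def coordinates(basis, v):
    """Solve c_1 v_1 + ... + c_n v_n = v; `basis` must be a basis of Q^n."""
    n = len(basis)
    # Augmented matrix: columns are the basis vectors, last column is v.
    m = [[Fraction(basis[k][i]) for k in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [a / m[col][col] for a in m[col]]
        for i in range(n):
            if i != col:
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]


def map_from_basis(basis, images):
    """The linear map T with T(basis[j]) = images[j], as in the proof of 3.5."""
    def T(v):
        c = coordinates(basis, v)
        return tuple(sum(cj * w[i] for cj, w in zip(c, images))
                     for i in range(len(images[0])))
    return T


basis = [(5, 7), (4, 3)]           # the basis of Q^2 from Example 2.40
images = [(1, 0, 2), (0, 1, 1)]    # arbitrary target vectors w_1, w_2 in Q^3
T = map_from_basis(basis, images)
print(T((5, 7)) == (1, 0, 2), T((4, 3)) == (0, 1, 1), T((9, 10)) == (1, 1, 3))
```

The last check uses $(9, 10) = (5, 7) + (4, 3)$, so additivity forces $T(9, 10) = w_1 + w_2$.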


Algebraic Operations on $\mathcal{L}(V, W)$

We begin by defining addition and scalar multiplication on $\mathcal{L}(V, W)$.

3.6 Definition addition and scalar multiplication on $\mathcal{L}(V, W)$

Suppose $S, T \in \mathcal{L}(V, W)$ and $\lambda \in \mathbf{F}$. The sum $S + T$ and the product
$\lambda T$ are the linear maps from V to W defined by

$$(S + T)(v) = Sv + Tv \quad\text{and}\quad (\lambda T)(v) = \lambda(Tv)$$

for all $v \in V$.

You should verify that $S + T$ and $\lambda T$ as defined above are indeed linear maps.
In other words, if $S, T \in \mathcal{L}(V, W)$ and $\lambda \in \mathbf{F}$, then $S + T \in \mathcal{L}(V, W)$ and
$\lambda T \in \mathcal{L}(V, W)$.

Because we took the trouble to define addition and scalar multiplication
on $\mathcal{L}(V, W)$, the next result should not be a surprise.

3.7 $\mathcal{L}(V, W)$ is a vector space

With the operations of addition and scalar multiplication as defined above,
$\mathcal{L}(V, W)$ is a vector space.

The routine proof of the result above is left to the reader. Note that the
additive identity of $\mathcal{L}(V, W)$ is the zero linear map defined earlier in this
section.

Usually  it  makes  no  sense  to  multiply  together  two  elements  of  a  vector 
space,  but  for  some  pairs  of  linear  maps  a  useful  product  exists.  We  will  need 
a  third  vector  space,  so  for  the  rest  of  this  section  suppose  U  is  a  vector  space 
over  F. 

3.8 Definition Product of Linear Maps

If $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$, then the product $ST \in \mathcal{L}(U, W)$ is
defined by

$$(ST)(u) = S(Tu)$$

for $u \in U$.


Although linear maps are pervasive
throughout mathematics, they
are not as ubiquitous as imagined
by some confused students who
seem to think that cos is a linear
map from $\mathbf{R}$ to $\mathbf{R}$ when they write
that $\cos 2x$ equals $2\cos x$ and that
$\cos(x + y)$ equals $\cos x + \cos y$.



In other words, $ST$ is just the usual composition $S \circ T$ of two functions,
but when both functions are linear, most mathematicians write $ST$ instead
of $S \circ T$. You should verify that $ST$ is indeed a linear map from U to W
whenever $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$.

Note that $ST$ is defined only when T maps into the domain of S.

3.9  Algebraic  properties  of  products  of  linear  maps 

associativity 

$$(T_1T_2)T_3 = T_1(T_2T_3)$$

whenever $T_1$, $T_2$, and $T_3$ are linear maps such that the products make
sense (meaning that $T_3$ maps into the domain of $T_2$, and $T_2$ maps into the
domain of $T_1$).

identity

$$TI = IT = T$$

whenever $T \in \mathcal{L}(V, W)$ (the first $I$ is the identity map on V, and the
second $I$ is the identity map on W).

distributive properties

$$(S_1 + S_2)T = S_1T + S_2T \quad\text{and}\quad S(T_1 + T_2) = ST_1 + ST_2$$

whenever $T, T_1, T_2 \in \mathcal{L}(U, V)$ and $S, S_1, S_2 \in \mathcal{L}(V, W)$.

The  routine  proof  of  the  result  above  is  left  to  the  reader. 

Multiplication  of  linear  maps  is  not  commutative.  In  other  words,  it  is  not 
necessarily  true  that  ST  =  TS,  even  if  both  sides  of  the  equation  make  sense. 

3.10 Example Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the differentiation map
defined in Example 3.4 and $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the multiplication by $x^2$
map defined earlier in this section. Show that $TD \ne DT$.

Solution We have

$$((TD)p)(x) = x^2 p'(x) \quad\text{but}\quad ((DT)p)(x) = x^2 p'(x) + 2xp(x).$$

In other words, differentiating and then multiplying by $x^2$ is not the same as
multiplying by $x^2$ and then differentiating.
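The computation in Example 3.10 can be replayed on a specific polynomial. In this sketch (our own encoding, not the book's) a polynomial is a list of coefficients $[c_0, c_1, c_2, \dots]$, `D` differentiates, and `T` multiplies by $x^2$. The sample polynomial $p(x) = 3 + x + 4x^2$ is arbitrary.

```python
def D(p):
    """Differentiate a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def T(p):
    """Multiply by x^2: every coefficient moves up two degrees."""
    return [0, 0] + list(p)


p = [3, 1, 4]    # p(x) = 3 + x + 4x^2
TD_p = T(D(p))   # x^2 p'(x)           = x^2 + 8x^3
DT_p = D(T(p))   # x^2 p'(x) + 2x p(x) = 6x + 3x^2 + 16x^3
print(TD_p, DT_p, TD_p != DT_p)
```

The difference `DT_p - TD_p` is $6x + 2x^2 + 8x^3 = 2x\,p(x)$, matching the extra term in the solution above.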
3.11 Linear maps take 0 to 0

Suppose T is a linear map from V to W. Then $T(0) = 0$.

Proof By additivity, we have

$$T(0) = T(0 + 0) = T(0) + T(0).$$

Add the additive inverse of $T(0)$ to each side of the equation above to conclude
that $T(0) = 0$. ■

EXERCISES 3.A

1 Suppose $b, c \in \mathbf{R}$. Define $T \colon \mathbf{R}^3 \to \mathbf{R}^2$ by

$$T(x, y, z) = (2x - 4y + 3z + b, 6x + cxyz).$$

Show that T is linear if and only if $b = c = 0$.


2 Suppose $b, c \in \mathbf{R}$. Define $T \colon \mathcal{P}(\mathbf{R}) \to \mathbf{R}^2$ by

$$Tp = \Big(3p(4) + 5p'(6) + bp(1)p(2),\ \int_{-1}^{2} x^3 p(x)\,dx + c \sin p(0)\Big).$$

Show that T is linear if and only if $b = c = 0$.

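3 Suppose $T \in \mathcal{L}(\mathbf{F}^n, \mathbf{F}^m)$. Show that there exist scalars $A_{j,k} \in \mathbf{F}$ for
$j = 1, \dots, m$ and $k = 1, \dots, n$ such that

$$T(x_1, \dots, x_n) = (A_{1,1}x_1 + \dots + A_{1,n}x_n, \dots, A_{m,1}x_1 + \dots + A_{m,n}x_n)$$

for every $(x_1, \dots, x_n) \in \mathbf{F}^n$.

[The exercise above shows that T has the form promised in the last item
of Example 3.4.]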

4 Suppose $T \in \mathcal{L}(V, W)$ and $v_1, \dots, v_m$ is a list of vectors in V such that
$Tv_1, \dots, Tv_m$ is a linearly independent list in W. Prove that $v_1, \dots, v_m$
is linearly independent.

5  Prove  the  assertion  in  3.7. 

6  Prove  the  assertions  in  3.9. 
7 Show that every linear map from a 1-dimensional vector space to itself is
multiplication by some scalar. More precisely, prove that if $\dim V = 1$
and $T \in \mathcal{L}(V, V)$, then there exists $\lambda \in \mathbf{F}$ such that $Tv = \lambda v$ for all
$v \in V$.

8 Give an example of a function $\varphi \colon \mathbf{R}^2 \to \mathbf{R}$ such that

$$\varphi(av) = a\varphi(v)$$

for all $a \in \mathbf{R}$ and all $v \in \mathbf{R}^2$ but $\varphi$ is not linear.

[The exercise above and the next exercise show that neither homogeneity
nor additivity alone is enough to imply that a function is a linear map.]

9 Give an example of a function $\varphi \colon \mathbf{C} \to \mathbf{C}$ such that

$$\varphi(w + z) = \varphi(w) + \varphi(z)$$

for all $w, z \in \mathbf{C}$ but $\varphi$ is not linear. (Here $\mathbf{C}$ is thought of as a complex
vector space.)

[There also exists a function $\varphi \colon \mathbf{R} \to \mathbf{R}$ such that $\varphi$ satisfies the
additivity condition above but $\varphi$ is not linear. However, showing the existence
of such a function involves considerably more advanced tools.]

10 Suppose U is a subspace of V with $U \ne V$. Suppose $S \in \mathcal{L}(U, W)$ and
$S \ne 0$ (which means that $Su \ne 0$ for some $u \in U$). Define $T \colon V \to W$
by

$$Tv = \begin{cases} Sv & \text{if } v \in U, \\ 0 & \text{if } v \in V \text{ and } v \notin U. \end{cases}$$

Prove that T is not a linear map on V.

11 Suppose V is finite-dimensional. Prove that every linear map on a
subspace of V can be extended to a linear map on V. In other words,
show that if U is a subspace of V and $S \in \mathcal{L}(U, W)$, then there exists
$T \in \mathcal{L}(V, W)$ such that $Tu = Su$ for all $u \in U$.

12 Suppose V is finite-dimensional with $\dim V > 0$, and suppose W is
infinite-dimensional. Prove that $\mathcal{L}(V, W)$ is infinite-dimensional.

13 Suppose $v_1, \dots, v_m$ is a linearly dependent list of vectors in V. Suppose
also that $W \ne \{0\}$. Prove that there exist $w_1, \dots, w_m \in W$ such that no
$T \in \mathcal{L}(V, W)$ satisfies $Tv_k = w_k$ for each $k = 1, \dots, m$.

14 Suppose V is finite-dimensional with $\dim V \ge 2$. Prove that there exist
$S, T \in \mathcal{L}(V, V)$ such that $ST \ne TS$.
Null Spaces and Ranges

Null  Space  and  Injectivity 

In  this  section  we  will  learn  about  two  subspaces  that  are  intimately  connected 
with  each  linear  map.  We  begin  with  the  set  of  vectors  that  get  mapped  to  0. 

3.12  Definition  null  space,  null  T 

For $T \in \mathcal{L}(V, W)$, the null space of T, denoted null T, is the subset of V
consisting of those vectors that T maps to 0:

$$\operatorname{null} T = \{v \in V : Tv = 0\}.$$


3.B 


3.13  Example  null  space 

• If T is the zero map from V to W, in other words if $Tv = 0$ for every
$v \in V$, then $\operatorname{null} T = V$.

• Suppose $\varphi \in \mathcal{L}(\mathbf{C}^3, \mathbf{F})$ is defined by $\varphi(z_1, z_2, z_3) = z_1 + 2z_2 + 3z_3$.
Then $\operatorname{null} \varphi = \{(z_1, z_2, z_3) \in \mathbf{C}^3 : z_1 + 2z_2 + 3z_3 = 0\}$. A basis of
$\operatorname{null} \varphi$ is $(-2, 1, 0), (-3, 0, 1)$.

• Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the differentiation map defined by
$Dp = p'$. The only functions whose derivative equals the zero function
are the constant functions. Thus the null space of D equals the set of
constant functions.

• Suppose $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the multiplication by $x^2$ map defined
by $(Tp)(x) = x^2 p(x)$. The only polynomial $p$ such that $x^2 p(x) = 0$
for all $x \in \mathbf{R}$ is the 0 polynomial. Thus $\operatorname{null} T = \{0\}$.

• Suppose $T \in \mathcal{L}(\mathbf{F}^\infty, \mathbf{F}^\infty)$ is the backward shift defined by

$$T(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots).$$

Clearly $T(x_1, x_2, x_3, \dots)$ equals 0 if and only if $x_2, x_3, \dots$ are all 0.
Thus in this case we have $\operatorname{null} T = \{(a, 0, 0, \dots) : a \in \mathbf{F}\}$.
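The second item of Example 3.13 is easy to spot-check. The sketch below (function names `phi` and `combo` are ours) evaluates $\varphi$ on the two claimed basis vectors of $\operatorname{null} \varphi$ and on one linear combination of them with complex coefficients; the particular coefficients are arbitrary.

```python
def phi(z):
    """phi(z1, z2, z3) = z1 + 2*z2 + 3*z3, as in Example 3.13."""
    z1, z2, z3 = z
    return z1 + 2 * z2 + 3 * z3


b1, b2 = (-2, 1, 0), (-3, 0, 1)  # the claimed basis of null phi


def combo(a, b):
    """The linear combination a*b1 + b*b2."""
    return tuple(a * s + b * t for s, t in zip(b1, b2))


in_null = [phi(b1) == 0, phi(b2) == 0, phi(combo(2 + 1j, -4)) == 0]
print(in_null)
```

This only confirms that the span of the two vectors lies in $\operatorname{null} \varphi$; that they span all of it still needs the argument in the text.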


The  next  result  shows  that  the  null 
space  of  each  linear  map  is  a  subspace 
of  the  domain.  In  particular,  0  is  in  the 
null  space  of  every  linear  map. 


Some mathematicians use the term
kernel instead of null space. The
word “null” means zero. Thus the
term “null space” should remind
you of the connection to 0.
3.14 The null space is a subspace

Suppose $T \in \mathcal{L}(V, W)$. Then null T is a subspace of V.

Proof Because T is a linear map, we know that $T(0) = 0$ (by 3.11). Thus
$0 \in \operatorname{null} T$.

Suppose $u, v \in \operatorname{null} T$. Then

$$T(u + v) = Tu + Tv = 0 + 0 = 0.$$

Hence $u + v \in \operatorname{null} T$. Thus null T is closed under addition.

Suppose $u \in \operatorname{null} T$ and $\lambda \in \mathbf{F}$. Then

$$T(\lambda u) = \lambda Tu = \lambda 0 = 0.$$

Hence $\lambda u \in \operatorname{null} T$. Thus null T is closed under scalar multiplication.

We have shown that null T contains
0 and is closed under addition and scalar
multiplication. Thus null T is a subspace
of V (by 1.34). ■

As  we  will  soon  see,  for  a  linear  map 
the  next  definition  is  closely  connected  to  the  null  space. 

3.15  Definition  injective 

A function $T \colon V \to W$ is called injective if $Tu = Tv$ implies $u = v$.


The definition above could be
rephrased to say that T is injective if
$u \ne v$ implies that $Tu \ne Tv$. In other
words, T is injective if it maps distinct
inputs to distinct outputs.


Many mathematicians use the term
one-to-one, which means the same
as injective.


The next result says that we can check whether a linear map is injective
by checking whether 0 is the only vector that gets mapped to 0. As a simple
application of this result, we see that of the linear maps whose null spaces we
computed in 3.13, only multiplication by $x^2$ is injective (except that the zero
map is injective in the special case $V = \{0\}$).


Take another look at the null spaces
that were computed in Example
3.13 and note that all of them are
subspaces.
3.16 Injectivity is equivalent to null space equals {0}

Let $T \in \mathcal{L}(V, W)$. Then T is injective if and only if $\operatorname{null} T = \{0\}$.

Proof First suppose T is injective. We want to prove that $\operatorname{null} T = \{0\}$. We
already know that $\{0\} \subset \operatorname{null} T$ (by 3.11). To prove the inclusion in the other
direction, suppose $v \in \operatorname{null} T$. Then

$$T(v) = 0 = T(0).$$

Because T is injective, the equation above implies that $v = 0$. Thus we can
conclude that $\operatorname{null} T = \{0\}$, as desired.

To prove the implication in the other direction, now suppose $\operatorname{null} T = \{0\}$.
We want to prove that T is injective. To do this, suppose $u, v \in V$ and
$Tu = Tv$. Then

$$0 = Tu - Tv = T(u - v).$$

Thus $u - v$ is in null T, which equals $\{0\}$. Hence $u - v = 0$, which implies
that $u = v$. Hence T is injective, as desired. ■
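For maps between spaces of tuples, 3.16 gives a practical test. If T is given by coefficients $A_{j,k}$ as in Example 3.4, then $T(x_1, \dots, x_n)$ is $x_1$ times the first column of $A$ plus ... plus $x_n$ times the last column, so $\operatorname{null} T = \{0\}$ exactly when the columns are linearly independent. The sketch below assumes rational entries; the helper names `rank` and `is_injective` are ours.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the span of a list of vectors (Gaussian elimination over Q)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_injective(A):
    """T is injective iff null T = {0} iff the columns of A are independent."""
    columns = list(zip(*A))
    return rank(columns) == len(columns)


A_T = [[2, 0], [0, 5], [1, 1]]  # T(x, y) = (2x, 5y, x + y)
A_phi = [[1, 2, 3]]             # phi(z1, z2, z3) = z1 + 2*z2 + 3*z3
print(is_injective(A_T), is_injective(A_phi))
```

The second map fails the test, consistent with the nonzero vector $(-2, 1, 0)$ in its null space.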

Range  and  Surjectivity 

Now  we  give  a  name  to  the  set  of  outputs  of  a  function. 

3.17  Definition  range 

For T a function from V to W, the range of T is the subset of W consisting
of those vectors that are of the form $Tv$ for some $v \in V$:

$$\operatorname{range} T = \{Tv : v \in V\}.$$


3.18  Example  range 

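• If T is the zero map from V to W, in other words if $Tv = 0$ for every
$v \in V$, then $\operatorname{range} T = \{0\}$.

• Suppose $T \in \mathcal{L}(\mathbf{R}^2, \mathbf{R}^3)$ is defined by $T(x, y) = (2x, 5y, x + y)$.
Then $\operatorname{range} T = \{(2x, 5y, x + y) : x, y \in \mathbf{R}\}$. A basis of range T is
$(2, 0, 1), (0, 5, 1)$.

• Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the differentiation map defined by
$Dp = p'$. Because for every polynomial $q \in \mathcal{P}(\mathbf{R})$ there exists a
polynomial $p \in \mathcal{P}(\mathbf{R})$ such that $p' = q$, the range of D is $\mathcal{P}(\mathbf{R})$.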
The next result shows that the range
of  each  linear  map  is  a  subspace  of 
the  vector  space  into  which  it  is  being 
mapped. 

3.19  The  range  is  a  subspace 

If $T \in \mathcal{L}(V, W)$, then range T is a subspace of W.

Proof Suppose $T \in \mathcal{L}(V, W)$. Then $T(0) = 0$ (by 3.11), which implies that
$0 \in \operatorname{range} T$.

If $w_1, w_2 \in \operatorname{range} T$, then there exist $v_1, v_2 \in V$ such that $Tv_1 = w_1$ and
$Tv_2 = w_2$. Thus

$$T(v_1 + v_2) = Tv_1 + Tv_2 = w_1 + w_2.$$

Hence $w_1 + w_2 \in \operatorname{range} T$. Thus range T is closed under addition.

If $w \in \operatorname{range} T$ and $\lambda \in \mathbf{F}$, then there exists $v \in V$ such that $Tv = w$.
Thus

$$T(\lambda v) = \lambda Tv = \lambda w.$$

Hence $\lambda w \in \operatorname{range} T$. Thus range T is closed under scalar multiplication.

We have shown that range T contains 0 and is closed under addition and
scalar multiplication. Thus range T is a subspace of W (by 1.34). ■


Some mathematicians use the word
image, which means the same as
range.

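The two closure steps in the proof of 3.19 can be watched on a concrete map. The following is only a minimal sketch: it uses the map T(x, y) = (2x, 5y, x + y) from Example 3.18, and the helper names `add` and `scale` and the particular vectors and scalar are our own choices for illustration.

```python
from fractions import Fraction

def T(v):
    """The map T(x, y) = (2x, 5y, x + y) from Example 3.18."""
    x, y = v
    return (2 * x, 5 * y, x + y)

def add(u, v):
    """Componentwise sum of two lists of the same length."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiple of a list."""
    return tuple(c * a for a in v)

# Two arbitrary inputs and a scalar, chosen only for illustration.
v1, v2 = (1, 2), (-3, 4)
lam = Fraction(3, 2)

w1, w2 = T(v1), T(v2)  # two vectors in range T

# w1 + w2 is again an output of T, namely T(v1 + v2).
sum_in_range = add(w1, w2) == T(add(v1, v2))

# lam * w1 is again an output of T, namely T(lam * v1).
multiple_in_range = scale(lam, w1) == T(scale(lam, v1))
```

Both flags come out `True`, which is exactly what additivity and homogeneity of T guarantee; a check on one pair of vectors is of course no substitute for the proof.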
3.20  Definition  surjective 

A function T : V → W is called surjective if its range equals W.

To illustrate the definition above, note that of the ranges we computed in
3.18, only the differentiation map is surjective (except that the zero map is
surjective in the special case W = {0}).

Whether  a  linear  map  is  surjective 
depends  on  what  we  are  thinking  of  as 
the  vector  space  into  which  it  maps. 

3.21 Example The differentiation map D ∈ ℒ(𝒫_5(R), 𝒫_5(R)) defined
by Dp = p′ is not surjective, because the polynomial x^5 is not in the range
of D. However, the differentiation map S ∈ ℒ(𝒫_5(R), 𝒫_4(R)) defined by
Sp = p′ is surjective, because its range equals 𝒫_4(R), which is now the
vector space into which S maps.

Many mathematicians use the term onto, which means the same as surjective.

Fundamental Theorem of Linear Maps

The  next  result  is  so  important  that  it  gets  a  dramatic  name. 

3.22  Fundamental  Theorem  of  Linear  Maps 

Suppose V is finite-dimensional and T ∈ ℒ(V, W). Then range T is
finite-dimensional  and 


dim  V  =  dim  null  T  +  dim  range  T. 


Proof  Let u_1, ..., u_m be a basis of null T; thus dim null T = m. The linearly
independent list u_1, ..., u_m can be extended to a basis

u_1, ..., u_m, v_1, ..., v_n

of V (by 2.33). Thus dim V = m + n. To complete the proof, we need only
show that range T is finite-dimensional and dim range T = n. We will do this
by proving that Tv_1, ..., Tv_n is a basis of range T.

Let v ∈ V. Because u_1, ..., u_m, v_1, ..., v_n spans V, we can write

v = a_1 u_1 + ⋯ + a_m u_m + b_1 v_1 + ⋯ + b_n v_n,

where the a's and b's are in F. Applying T to both sides of this equation, we
get

Tv = b_1 Tv_1 + ⋯ + b_n Tv_n,

where the terms of the form Tu_j disappeared because each u_j is in null T.
The last equation implies that Tv_1, ..., Tv_n spans range T. In particular,
range T is finite-dimensional.

To show Tv_1, ..., Tv_n is linearly independent, suppose c_1, ..., c_n ∈ F
and

c_1 Tv_1 + ⋯ + c_n Tv_n = 0.

Then

T(c_1 v_1 + ⋯ + c_n v_n) = 0.

Hence

c_1 v_1 + ⋯ + c_n v_n ∈ null T.

Because u_1, ..., u_m spans null T, we can write

c_1 v_1 + ⋯ + c_n v_n = d_1 u_1 + ⋯ + d_m u_m,

where the d's are in F. This equation implies that all the c's (and d's) are 0
(because u_1, ..., u_m, v_1, ..., v_n is linearly independent). Thus Tv_1, ..., Tv_n
is linearly independent and hence is a basis of range T, as desired. ■

Now we can show that no linear map from a finite-dimensional vector
space to a “smaller” vector space can be injective, where “smaller” is measured
by dimension.

3.23  A  map  to  a  smaller  dimensional  space  is  not  injective 

Suppose  V  and  W  are  finite-dimensional  vector  spaces  such  that 
dim  V  >  dim  W.  Then  no  linear  map  from  V  to  W  is  injective. 

Proof  Let T ∈ ℒ(V, W). Then

dim null T = dim V − dim range T
           ≥ dim V − dim W
           > 0,

where the equality above comes from the Fundamental Theorem of Linear
Maps (3.22). The inequality above states that dim null T > 0. This means
that null T contains vectors other than 0. Thus T is not injective (by 3.16). ■

The  next  result  shows  that  no  linear  map  from  a  finite-dimensional  vector 
space  to  a  “bigger”  vector  space  can  be  surjective,  where  “bigger”  is  measured 
by  dimension. 

3.24  A  map  to  a  larger  dimensional  space  is  not  surjective 

Suppose  V  and  W  are  finite-dimensional  vector  spaces  such  that 
dim  V  <  dim  W.  Then  no  linear  map  from  V  to  W  is  surjective. 

Proof  Let T ∈ ℒ(V, W). Then

dim range T = dim V − dim null T
            ≤ dim V
            < dim W,

where the equality above comes from the Fundamental Theorem of Linear
Maps (3.22). The inequality above states that dim range T < dim W. This
means that range T cannot equal W. Thus T is not surjective. ■

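The dimension count in 3.22 can be checked by machine for a specific map. The sketch below is only an illustration under stated assumptions: the coefficient table `A` describes a hypothetical linear map T from F^5 to F^3 (with F taken to be the rationals so that arithmetic is exact), and the helpers `rref`, `null_basis`, and `apply` are names invented here. Row reduction serves purely as a computational tool; it plays no role in the proofs above.

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce with exact fractions; return (reduced rows, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def null_basis(A):
    """One basis vector of null T for each column without a pivot."""
    R, pivots = rref(A)
    n = len(A[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[f] = Fraction(1)
        for i, p in enumerate(pivots):
            x[p] = -R[i][f]
        basis.append(x)
    return basis

def apply(A, x):
    """T(x_1, ..., x_n) for the map whose coefficients are the rows of A."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

# Hypothetical coefficients of a linear map T from F^5 to F^3.
A = [[1, 2, 0, 1, 3],
     [2, 4, 1, 0, 1],
     [3, 6, 1, 1, 4]]

n = len(A[0])  # dim V
kernel = null_basis(A)
dim_null = len(kernel)

# range T is spanned by T(e_1), ..., T(e_n); count how many are independent.
images = [apply(A, [1 if i == k else 0 for i in range(n)]) for k in range(n)]
dim_range = len(rref(images)[1])
```

Here `dim_null` is 3 and `dim_range` is 2, and their sum is 5 = dim F^5. Because 5 > 3, result 3.23 already predicts that null T contains nonzero vectors, and `kernel` lists three of them.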
As  we  will  now  see,  3.23  and  3.24  have  important  consequences  in  the 
theory  of  linear  equations.  The  idea  here  is  to  express  questions  about  systems 
of linear equations in terms of linear maps.

3.25 Example Rephrase in terms of a linear map the question of whether
a homogeneous system of linear equations has a nonzero solution.

Solution 

Fix positive integers m and n, and let A_{j,k} ∈ F for j = 1, ..., m and
k = 1, ..., n. Consider the homogeneous system of linear equations

$$
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= 0 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{m,k} x_k &= 0.
\end{aligned}
$$

Homogeneous, in this context, means that the constant term on the right side
of each equation above is 0.

Obviously x_1 = ⋯ = x_n = 0 is a solution of the system of equations above;
the question here is whether any other solutions exist.

Define T : F^n → F^m by

$$
T(x_1, \dots, x_n) = \Bigl( \sum_{k=1}^{n} A_{1,k} x_k, \; \dots, \; \sum_{k=1}^{n} A_{m,k} x_k \Bigr).
$$

The equation T(x_1, ..., x_n) = 0 (the 0 here is the additive identity in F^m,
namely, the list of length m of all 0's) is the same as the homogeneous system
of linear equations above.

Thus  we  want  to  know  if  null  T  is  strictly  bigger  than  {0}.  In  other  words, 
we  can  rephrase  our  question  about  nonzero  solutions  as  follows  (by  3.16): 
What  condition  ensures  that  T  is  not  injective? 


3.26  Homogeneous  system  of  linear  equations 

A  homogeneous  system  of  linear  equations  with  more  variables  than 
equations  has  nonzero  solutions. 

Proof  Use the notation and result from the example above. Thus T is a
linear map from F^n to F^m, and we have a homogeneous system of m linear
equations with n variables x_1, ..., x_n. From 3.23 we see that T is not injective
if n > m. ■

Example of the result above: a homogeneous system of four linear equations
with five variables has nonzero solutions.

3.27 Example Consider the question of whether an inhomogeneous system
of linear equations has no solutions for some choice of the constant terms.
Rephrase this question in terms of a linear map.


Solution  Fix positive integers m and n, and let A_{j,k} ∈ F for j = 1, ..., m
and k = 1, ..., n. For c_1, ..., c_m ∈ F, consider the system of linear equations

$$
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= c_1 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{m,k} x_k &= c_m.
\end{aligned}
\tag{3.28}
$$

The question here is whether there is some choice of c_1, ..., c_m ∈ F such that
no solution exists to the system above.

Define T : F^n → F^m by

$$
T(x_1, \dots, x_n) = \Bigl( \sum_{k=1}^{n} A_{1,k} x_k, \; \dots, \; \sum_{k=1}^{n} A_{m,k} x_k \Bigr).
$$

The equation T(x_1, ..., x_n) = (c_1, ..., c_m) is the same as the system of
equations 3.28. Thus we want to know if range T ≠ F^m. Hence we can rephrase
our question about not having a solution for some choice of c_1, ..., c_m ∈ F
as follows: What condition ensures that T is not surjective?


3.29  Inhomogeneous  system  of  linear  equations 

An  inhomogeneous  system  of  linear  equations  with  more  equations  than 
variables  has  no  solution  for  some  choice  of  the  constant  terms. 

Proof  Use the notation and result from the example above. Thus T is a
linear map from F^n to F^m, and we have a system of m equations with n
variables x_1, ..., x_n. From 3.24 we see that T is not surjective if n < m. ■

Example of the result above: an inhomogeneous system of five linear
equations with four variables has no solution for some choice of the constant
terms.

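Results 3.26 and 3.29 assert that certain solutions exist or fail to exist; a small computation can exhibit them. The sketch below is an assumption-laden illustration, not part of the text's argument: the two coefficient tables are made up, the scalars are rationals, and the helper names `rref`, `nonzero_homogeneous_solution`, and `is_solvable` are ours.

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce with exact fractions; return (reduced rows, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def nonzero_homogeneous_solution(A):
    """Return a nonzero x with Ax = 0, or None if only x = 0 works."""
    R, pivots = rref(A)
    n = len(A[0])
    free = [c for c in range(n) if c not in pivots]
    if not free:
        return None
    f = free[0]
    x = [Fraction(0)] * n
    x[f] = Fraction(1)
    for i, p in enumerate(pivots):
        x[p] = -R[i][f]
    return x

def is_solvable(A, c):
    """True if Ax = c has a solution: no pivot may land in the constants column."""
    augmented = [list(row) + [ci] for row, ci in zip(A, c)]
    _, pivots = rref(augmented)
    return len(A[0]) not in pivots

# 3.26: two homogeneous equations in three variables (made-up coefficients).
H = [[1, 2, 3],
     [4, 5, 6]]
x = nonzero_homogeneous_solution(H)

# 3.29: three equations in two variables (made-up coefficients).
B = [[1, 0],
     [0, 1],
     [1, 1]]
standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Some standard basis vector of F^3 must lie outside range T.
bad_c = next(c for c in standard if not is_solvable(B, c))
```

For these tables, `x` is the nonzero solution (1, −2, 1), and `bad_c` is a choice of constant terms for which the second system has no solution. Searching the standard basis of F^m is enough, because if every standard basis vector were in range T, then range T would equal F^m.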

Our results about homogeneous systems with more variables than
equations and inhomogeneous systems with more equations than variables
(3.26 and 3.29) are often proved using Gaussian elimination. The abstract
approach taken here leads to cleaner proofs.

EXERCISES 3.B

1  Give  an  example  of  a  linear  map  T  such  that  dim  null  T  =  3  and 
dim range T = 2.

2  Suppose V is a vector space and S, T ∈ ℒ(V, V) are such that

range S ⊂ null T.

Prove that (ST)^2 = 0.

3  Suppose v_1, ..., v_m is a list of vectors in V. Define T ∈ ℒ(F^m, V) by

T(z_1, ..., z_m) = z_1 v_1 + ⋯ + z_m v_m.

(a)  What property of T corresponds to v_1, ..., v_m spanning V?

(b)  What property of T corresponds to v_1, ..., v_m being linearly
independent?

4  Show  that 

{T ∈ ℒ(R^5, R^4) : dim null T > 2}
is not a subspace of ℒ(R^5, R^4).

5  Give an example of a linear map T : R^4 → R^4 such that

range T = null T.

6  Prove that there does not exist a linear map T : R^5 → R^5 such that

range T = null T.

7  Suppose V and W are finite-dimensional with 2 ≤ dim V ≤ dim W.
Show that {T ∈ ℒ(V, W) : T is not injective} is not a subspace of
ℒ(V, W).

8  Suppose V and W are finite-dimensional with dim V ≥ dim W ≥ 2.
Show that {T ∈ ℒ(V, W) : T is not surjective} is not a subspace of
ℒ(V, W).

9  Suppose T ∈ ℒ(V, W) is injective and v_1, ..., v_n is linearly independent
in V. Prove that Tv_1, ..., Tv_n is linearly independent in W.

10  Suppose v_1, ..., v_n spans V and T ∈ ℒ(V, W). Prove that the list
Tv_1, ..., Tv_n spans range T.

11  Suppose S_1, ..., S_n are injective linear maps such that S_1 S_2 ⋯ S_n
makes sense. Prove that S_1 S_2 ⋯ S_n is injective.

12  Suppose that V is finite-dimensional and that T ∈ ℒ(V, W). Prove
that there exists a subspace U of V such that U ∩ null T = {0} and
range T = {Tu : u ∈ U}.

13  Suppose T is a linear map from F^4 to F^2 such that

null T = {(x_1, x_2, x_3, x_4) ∈ F^4 : x_1 = 5x_2 and x_3 = 7x_4}.

Prove  that  T  is  surjective. 

14  Suppose U is a 3-dimensional subspace of R^8 and that T is a linear map
from R^8 to R^5 such that null T = U. Prove that T is surjective.

15  Prove that there does not exist a linear map from F^5 to F^2 whose null
space equals

{(x_1, x_2, x_3, x_4, x_5) ∈ F^5 : x_1 = 3x_2 and x_3 = x_4 = x_5}.

16  Suppose  there  exists  a  linear  map  on  V  whose  null  space  and  range  are 
both  finite-dimensional.  Prove  that  V  is  finite-dimensional. 

17  Suppose V and W are both finite-dimensional. Prove that there exists an
injective linear map from V to W if and only if dim V ≤ dim W.

18  Suppose V and W are both finite-dimensional. Prove that there exists a
surjective linear map from V onto W if and only if dim V ≥ dim W.

19  Suppose V and W are finite-dimensional and that U is a subspace of V.
Prove that there exists T ∈ ℒ(V, W) such that null T = U if and only if
dim U ≥ dim V − dim W.

20  Suppose W is finite-dimensional and T ∈ ℒ(V, W). Prove that T is
injective if and only if there exists S ∈ ℒ(W, V) such that ST is the
identity map on V.

21  Suppose V is finite-dimensional and T ∈ ℒ(V, W). Prove that T is
surjective if and only if there exists S ∈ ℒ(W, V) such that TS is the
identity map on W.

22  Suppose U and V are finite-dimensional vector spaces and S ∈ ℒ(V, W)
and T ∈ ℒ(U, V). Prove that

dim null ST ≤ dim null S + dim null T.

23  Suppose U and V are finite-dimensional vector spaces and S ∈ ℒ(V, W)
and T ∈ ℒ(U, V). Prove that

dim range ST ≤ min{dim range S, dim range T}.

24  Suppose W is finite-dimensional and T_1, T_2 ∈ ℒ(V, W). Prove that
null T_1 ⊂ null T_2 if and only if there exists S ∈ ℒ(W, W) such that
T_2 = S T_1.

25  Suppose V is finite-dimensional and T_1, T_2 ∈ ℒ(V, W). Prove that
range T_1 ⊂ range T_2 if and only if there exists S ∈ ℒ(V, V) such that
T_1 = T_2 S.

26  Suppose D ∈ ℒ(𝒫(R), 𝒫(R)) is such that deg Dp = (deg p) − 1 for
every nonconstant polynomial p ∈ 𝒫(R). Prove that D is surjective.
[The notation D is used above to remind you of the differentiation map
that sends a polynomial p to p′. Without knowing the formula for the
derivative of a polynomial (except that it reduces the degree by 1), you
can use the exercise above to show that for every polynomial q ∈ 𝒫(R),
there exists a polynomial p ∈ 𝒫(R) such that p′ = q.]

27  Suppose p ∈ 𝒫(R). Prove that there exists a polynomial q ∈ 𝒫(R) such
that 5q″ + 3q′ = p.

[This exercise can be done without linear algebra, but it's more fun to do
it using linear algebra.]

28  Suppose T ∈ ℒ(V, W), and w_1, ..., w_m is a basis of range T. Prove that
there exist φ_1, ..., φ_m ∈ ℒ(V, F) such that

Tv = φ_1(v) w_1 + ⋯ + φ_m(v) w_m

for every v ∈ V.

29  Suppose φ ∈ ℒ(V, F). Suppose u ∈ V is not in null φ. Prove that

V = null φ ⊕ {au : a ∈ F}.

30  Suppose φ_1 and φ_2 are linear maps from V to F that have the same null
space. Show that there exists a constant c ∈ F such that φ_1 = cφ_2.

31  Give an example of two linear maps T_1 and T_2 from R^5 to R^2 that have
the same null space but are such that T_1 is not a scalar multiple of T_2.

3.C  Matrices


Representing  a  Linear  Map  by  a  Matrix 

We know that if v_1, ..., v_n is a basis of V and T : V → W is linear, then the
values of Tv_1, ..., Tv_n determine the values of T on arbitrary vectors in V
(see 3.5). As we will soon see, matrices are used as an efficient method of
recording the values of the Tv_j's in terms of a basis of W.

3.30 Definition matrix, A_{j,k}

Let m and n denote positive integers. An m-by-n matrix A is a rectangular
array of elements of F with m rows and n columns:

$$
A = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{m,1} & \cdots & A_{m,n} \end{pmatrix}.
$$

The notation A_{j,k} denotes the entry in row j, column k of A. In other
words, the first index refers to the row number and the second index refers
to the column number.

Thus A_{2,3} refers to the entry in the second row, third column of a matrix A.


3.31  Example 


Now  we  come  to  the  key  definition  in  this  section. 

3.32 Definition matrix of a linear map, M(T)

Suppose T ∈ ℒ(V, W) and v_1, ..., v_n is a basis of V and w_1, ..., w_m is
a basis of W. The matrix of T with respect to these bases is the m-by-n
matrix M(T) whose entries A_{j,k} are defined by

Tv_k = A_{1,k} w_1 + ⋯ + A_{m,k} w_m.

If the bases are not clear from the context, then the notation
M(T, (v_1, ..., v_n), (w_1, ..., w_m)) is used.

The matrix M(T) of a linear map T ∈ ℒ(V, W) depends on the basis
v_1, ..., v_n of V and the basis w_1, ..., w_m of W, as well as on T. However, the
bases should be clear from the context, and thus they are often not included in
the notation.

To remember how M(T) is constructed from T, you might write across
the top of the matrix the basis vectors v_1, ..., v_n for the domain and along the
left the basis vectors w_1, ..., w_m for the vector space into which T maps, as
follows:

$$
M(T) =
\begin{array}{c|ccccc}
 & v_1 & \cdots & v_k & \cdots & v_n \\ \hline
w_1 & & & A_{1,k} & & \\
\vdots & & & \vdots & & \\
w_m & & & A_{m,k} & &
\end{array}
$$

In the matrix above only the kth column is shown. Thus the second index
of each displayed entry of the matrix above is k. The picture above should
remind you that Tv_k can be computed from M(T) by multiplying each entry
in the kth column by the corresponding w_j from the left column, and then
adding up the resulting vectors.

The kth column of M(T) consists of the scalars needed to write Tv_k as a
linear combination of (w_1, ..., w_m):

$$
T v_k = \sum_{j=1}^{m} A_{j,k} w_j.
$$

If T is a linear map from F^n to F^m, then unless stated otherwise, assume the
bases in question are the standard ones (where the kth basis vector is 1 in the
kth slot and 0 in all the other slots). If you think of elements of F^m as columns
of m numbers, then you can think of the kth column of M(T) as T applied to
the kth standard basis vector.

If T maps an n-dimensional vector space to an m-dimensional vector
space, then M(T) is an m-by-n matrix.

3.33 Example Suppose T ∈ ℒ(F^2, F^3) is defined by

T(x, y) = (x + 3y, 2x + 5y, 7x + 9y).

Find the matrix of T with respect to the standard bases of F^2 and F^3.

Solution  Because T(1, 0) = (1, 2, 7) and T(0, 1) = (3, 5, 9), the matrix of
T with respect to the standard bases is the 3-by-2 matrix below:

$$
M(T) = \begin{pmatrix} 1 & 3 \\ 2 & 5 \\ 7 & 9 \end{pmatrix}.
$$

When working with 𝒫_m(F), use the standard basis 1, x, x^2, ..., x^m unless
the context indicates otherwise.


3.34 Example Suppose D ∈ ℒ(𝒫_3(R), 𝒫_2(R)) is the differentiation map
defined by Dp = p′. Find the matrix of D with respect to the standard bases
of 𝒫_3(R) and 𝒫_2(R).

Solution  Because (x^n)′ = n x^{n−1}, the matrix of D with respect to the
standard bases is the 3-by-4 matrix below:

$$
M(D) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.
$$

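The recipe behind Examples 3.33 and 3.34 (apply the map to each standard basis vector and record the result as a column) is mechanical enough to automate. What follows is a minimal sketch under our own conventions: vectors are lists, a polynomial a_0 + a_1 x + a_2 x^2 + a_3 x^3 is stored as its coefficient list, and the function name `matrix_of` is hypothetical.

```python
def matrix_of(T, n):
    """Matrix of T with respect to standard bases: column k is T(e_k)."""
    columns = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        columns.append(list(T(e_k)))
    m = len(columns[0])
    # Turn the list of columns into a list of rows.
    return [[columns[k][j] for k in range(n)] for j in range(m)]

def T(v):
    """The map of Example 3.33."""
    x, y = v
    return (x + 3 * y, 2 * x + 5 * y, 7 * x + 9 * y)

def D(coeffs):
    """Differentiate a0 + a1 x + a2 x^2 + a3 x^3, given as [a0, a1, a2, a3]."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

M_T = matrix_of(T, 2)  # [[1, 3], [2, 5], [7, 9]]
M_D = matrix_of(D, 4)  # [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
```

Storing a polynomial by its coefficients amounts to working with the standard basis 1, x, x^2, x^3, so `M_D` is the matrix of D with respect to the standard bases, as in 3.34.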

Addition  and  Scalar  Multiplication  of  Matrices 

For  the  rest  of  this  section,  assume  that  V  and  W  are  finite-dimensional  and 
that  a  basis  has  been  chosen  for  each  of  these  vector  spaces.  Thus  for  each 
linear  map  from  V  to  W,  we  can  talk  about  its  matrix  (with  respect  to  the 
chosen  bases,  of  course).  Is  the  matrix  of  the  sum  of  two  linear  maps  equal  to 
the  sum  of  the  matrices  of  the  two  maps? 

Right  now  this  question  does  not  make  sense,  because  although  we  have 
defined  the  sum  of  two  linear  maps,  we  have  not  defined  the  sum  of  two 
matrices.  Fortunately,  the  obvious  definition  of  the  sum  of  two  matrices  has 
the  right  properties.  Specifically,  we  make  the  following  definition. 


3.35  Definition  matrix  addition 

The sum of two matrices of the same size is the matrix obtained by adding
corresponding entries in the matrices:

$$
\begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{m,1} & \cdots & A_{m,n} \end{pmatrix}
+
\begin{pmatrix} C_{1,1} & \cdots & C_{1,n} \\ \vdots & & \vdots \\ C_{m,1} & \cdots & C_{m,n} \end{pmatrix}
=
\begin{pmatrix} A_{1,1} + C_{1,1} & \cdots & A_{1,n} + C_{1,n} \\ \vdots & & \vdots \\ A_{m,1} + C_{m,1} & \cdots & A_{m,n} + C_{m,n} \end{pmatrix}.
$$

In other words, (A + C)_{j,k} = A_{j,k} + C_{j,k}.

In the following result, the assumption is that the same bases are used for
all three linear maps S + T, S, and T.

3.36  The  matrix  of  the  sum  of  linear  maps 

Suppose S, T ∈ ℒ(V, W). Then M(S + T) = M(S) + M(T).

The  verification  of  the  result  above  is  left  to  the  reader. 

Still  assuming  that  we  have  some  bases  in  mind,  is  the  matrix  of  a  scalar 
times  a  linear  map  equal  to  the  scalar  times  the  matrix  of  the  linear  map? 
Again  the  question  does  not  make  sense,  because  we  have  not  defined  scalar 
multiplication  on  matrices.  Fortunately,  the  obvious  definition  again  has  the 
right  properties. 


3.37  Definition  scalar  multiplication  of  a  matrix 


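The product of a scalar and a matrix is the matrix obtained by multiplying
each entry in the matrix by the scalar:

$$
\lambda
\begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{m,1} & \cdots & A_{m,n} \end{pmatrix}
=
\begin{pmatrix} \lambda A_{1,1} & \cdots & \lambda A_{1,n} \\ \vdots & & \vdots \\ \lambda A_{m,1} & \cdots & \lambda A_{m,n} \end{pmatrix}.
$$

In other words, (λA)_{j,k} = λA_{j,k}.

In the following result, the assumption is that the same bases are used for
both linear maps λT and T.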

3.38  The  matrix  of  a  scalar  times  a  linear  map 
Suppose λ ∈ F and T ∈ ℒ(V, W). Then M(λT) = λM(T).

The  verification  of  the  result  above  is  also  left  to  the  reader. 

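Although the verifications of 3.36 and 3.38 are left to the reader, a spot check on one pair of maps may help fix ideas. The sketch below assumes standard bases throughout; the map S, the scalar 3, and the helper names `matrix_of`, `mat_add`, and `mat_scale` are hypothetical choices made for illustration, while T is the map from Example 3.33.

```python
def matrix_of(T, n):
    """Matrix of T with respect to standard bases: column k is T(e_k)."""
    columns = [list(T([1 if i == k else 0 for i in range(n)])) for k in range(n)]
    return [[col[j] for col in columns] for j in range(len(columns[0]))]

def mat_add(A, C):
    """Entrywise sum of two matrices of the same size (Definition 3.35)."""
    return [[a + c for a, c in zip(ra, rc)] for ra, rc in zip(A, C)]

def mat_scale(lam, A):
    """Scalar times a matrix (Definition 3.37)."""
    return [[lam * a for a in row] for row in A]

def S(v):
    """A made-up linear map from F^2 to F^3."""
    x, y = v
    return (x, x - y, 4 * y)

def T(v):
    """The map of Example 3.33."""
    x, y = v
    return (x + 3 * y, 2 * x + 5 * y, 7 * x + 9 * y)

lam = 3

def S_plus_T(v):
    """The sum of the linear maps S and T."""
    return tuple(a + b for a, b in zip(S(v), T(v)))

def lam_T(v):
    """The scalar multiple lam * T."""
    return tuple(lam * a for a in T(v))

sum_matches = matrix_of(S_plus_T, 2) == mat_add(matrix_of(S, 2), matrix_of(T, 2))
scale_matches = matrix_of(lam_T, 2) == mat_scale(lam, matrix_of(T, 2))
```

Both comparisons hold for this pair of maps. The general statements still require the short arguments requested in Exercises 7 and 8.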
Because  addition  and  scalar  multiplication  have  now  been  defined  for 
matrices,  you  should  not  be  surprised  that  a  vector  space  is  about  to  appear. 
We  need  only  a  bit  of  notation  so  that  this  new  vector  space  has  a  name. 


3.39  Notation 

For m and n positive integers, the set of all m-by-n matrices with entries
in F is denoted by F^{m,n}.

3.40  dim F^{m,n} = mn

Suppose m and n are positive integers. With addition and scalar multiplication
defined as above, F^{m,n} is a vector space with dimension mn.

Proof  The verification that F^{m,n} is a vector space is left to the reader. Note
that the additive identity of F^{m,n} is the m-by-n matrix whose entries all
equal 0.

The reader should also verify that the list of m-by-n matrices that have 0
in all entries except for a 1 in one entry is a basis of F^{m,n}. There are mn such
matrices, so the dimension of F^{m,n} equals mn. ■

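The basis described in the proof of 3.40 is easy to generate. In the following sketch, the names `unit_matrices` and `combine` are ours, matrices are lists of rows, and the 2-by-3 matrix `A` is an arbitrary example chosen for illustration.

```python
def unit_matrices(m, n):
    """All m-by-n matrices with a single 1 and 0 elsewhere, keyed by (row, column)."""
    return {
        (j, k): [[1 if (r, c) == (j, k) else 0 for c in range(n)] for r in range(m)]
        for j in range(m)
        for k in range(n)
    }

def combine(A, basis):
    """Rebuild A as the sum of A[j][k] times the unit matrix for position (j, k)."""
    m, n = len(A), len(A[0])
    total = [[0] * n for _ in range(m)]
    for (j, k), E in basis.items():
        for r in range(m):
            for c in range(n):
                total[r][c] += A[j][k] * E[r][c]
    return total

basis = unit_matrices(2, 3)
A = [[8, 4, 5], [1, 9, 7]]   # an arbitrary 2-by-3 matrix
rebuilt = combine(A, basis)  # equals A again
```

There are 2 · 3 = 6 matrices in `basis`, and `rebuilt` equals `A`, which illustrates the spanning half of the claim. The coefficients are forced to be the entries of `A`, which is the linear independence half.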
Matrix  Multiplication 

Suppose, as previously, that v_1, ..., v_n is a basis of V and w_1, ..., w_m is
a basis of W. Suppose also that we have another vector space U and that
u_1, ..., u_p is a basis of U.

Consider linear maps T : U → V and S : V → W. The composition
ST is a linear map from U to W. Does M(ST) equal M(S)M(T)? This
question does not yet make sense, because we have not defined the product of
two matrices. We will choose a definition of matrix multiplication that forces
this question to have a positive answer. Let's see how to do this.

Suppose M(S) = A and M(T) = C. For 1 ≤ k ≤ p, we have

$$
\begin{aligned}
(ST)u_k &= S\Bigl(\sum_{r=1}^{n} C_{r,k} v_r\Bigr) \\
&= \sum_{r=1}^{n} C_{r,k} S v_r \\
&= \sum_{r=1}^{n} C_{r,k} \sum_{j=1}^{m} A_{j,r} w_j \\
&= \sum_{j=1}^{m} \Bigl(\sum_{r=1}^{n} A_{j,r} C_{r,k}\Bigr) w_j.
\end{aligned}
$$

Thus M(ST) is the m-by-p matrix whose entry in row j, column k, equals

$$
\sum_{r=1}^{n} A_{j,r} C_{r,k}.
$$

Now we see how to define matrix multiplication so that the desired equation
M(ST) = M(S)M(T) holds.


3.41  Definition  matrix  multiplication 

Suppose A is an m-by-n matrix and C is an n-by-p matrix. Then AC is
defined to be the m-by-p matrix whose entry in row j, column k, is given
by the following equation:

$$
(AC)_{j,k} = \sum_{r=1}^{n} A_{j,r} C_{r,k}.
$$

In other words, the entry in row j, column k, of AC is computed by
taking row j of A and column k of C, multiplying together corresponding
entries, and then summing.


Note  that  we  define  the  product  of 
two  matrices  only  when  the  number  of 
columns  of  the  first  matrix  equals  the 
number  of  rows  of  the  second  matrix. 


You may have learned this definition of matrix multiplication in an
earlier course, although you may not have seen the motivation for it.


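3.42 Example Here we multiply together a 3-by-2 matrix and a 2-by-4
matrix, obtaining a 3-by-4 matrix:

$$
\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}
\begin{pmatrix} 6 & 5 & 4 & 3 \\ 2 & 1 & 0 & -1 \end{pmatrix}
=
\begin{pmatrix} 10 & 7 & 4 & 1 \\ 26 & 19 & 12 & 5 \\ 42 & 31 & 20 & 9 \end{pmatrix}.
$$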

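Definition 3.41 translates almost word for word into code. The sketch below stores a matrix as a list of rows and uses 0-based indices, so the mathematical entry in row j, column k corresponds to `A[j - 1][k - 1]`; the function name `mat_mul` is our own.

```python
def mat_mul(A, C):
    """Product of an m-by-n matrix A and an n-by-p matrix C (Definition 3.41)."""
    n = len(C)
    if any(len(row) != n for row in A):
        raise ValueError("columns of the first matrix must equal rows of the second")
    p = len(C[0])
    # Entry (j, k) is the sum over r of A[j][r] * C[r][k].
    return [[sum(A[j][r] * C[r][k] for r in range(n)) for k in range(p)]
            for j in range(len(A))]

# The two matrices of Example 3.42.
A = [[1, 2],
     [3, 4],
     [5, 6]]
C = [[6, 5, 4, 3],
     [2, 1, 0, -1]]

product = mat_mul(A, C)
# [[10, 7, 4, 1], [26, 19, 12, 5], [42, 31, 20, 9]]
```

The size check mirrors the requirement that the number of columns of the first matrix equal the number of rows of the second; for instance, `mat_mul(C, C)` raises an error because C has 4 columns but only 2 rows.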

Matrix  multiplication  is  not  commutative.  In  other  words,  AC  is  not 
necessarily  equal  to  CA  even  if  both  products  are  defined  (see  Exercise  12). 
Matrix  multiplication  is  distributive  and  associative  (see  Exercises  13  and  14). 

In the following result, the assumption is that the same basis of V is used
in considering T ∈ ℒ(U, V) and S ∈ ℒ(V, W), the same basis of W is used
in considering S ∈ ℒ(V, W) and ST ∈ ℒ(U, W), and the same basis of U is
used in considering T ∈ ℒ(U, V) and ST ∈ ℒ(U, W).


3.43  The  matrix  of  the  product  of  linear  maps 

If T ∈ ℒ(U, V) and S ∈ ℒ(V, W), then M(ST) = M(S)M(T).

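The statement of 3.43 can be tested on a pair of concrete maps. This is a sketch under our own assumptions: all bases are standard, T is the map from Example 3.33 (from F^2 to F^3), S is a made-up map from F^3 to F^2, and the helper names are hypothetical.

```python
def matrix_of(T, n):
    """Matrix of T with respect to standard bases: column k is T(e_k)."""
    columns = [list(T([1 if i == k else 0 for i in range(n)])) for k in range(n)]
    return [[col[j] for col in columns] for j in range(len(columns[0]))]

def mat_mul(A, C):
    """Matrix product as in Definition 3.41."""
    return [[sum(A[j][r] * C[r][k] for r in range(len(C)))
             for k in range(len(C[0]))] for j in range(len(A))]

def T(v):
    """The map of Example 3.33, from F^2 to F^3."""
    x, y = v
    return (x + 3 * y, 2 * x + 5 * y, 7 * x + 9 * y)

def S(v):
    """A made-up linear map from F^3 to F^2."""
    x, y, z = v
    return (x + z, 2 * y - z)

def ST(v):
    """The composition: first apply T, then S."""
    return S(T(v))

lhs = matrix_of(ST, 2)                            # M(ST)
rhs = mat_mul(matrix_of(S, 3), matrix_of(T, 2))   # M(S)M(T)
```

Both sides equal the 2-by-2 matrix with rows (8, 12) and (−3, 1). Note the order: M(S) is 2-by-3 and M(T) is 3-by-2, so the product M(S)M(T) is defined and has the same size as M(ST).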

The proof of 3.43 is the calculation that was done as motivation
before the definition of matrix multiplication.

In the next piece of notation, note that as usual the first index refers to a
row and the second index refers to a column, with a vertically centered dot
used as a placeholder.

3.44 Notation A_{j,·}, A_{·,k}
Suppose A is an m-by-n matrix.

• If 1 ≤ j ≤ m, then A_{j,·} denotes the 1-by-n matrix consisting of
row j of A.

• If 1 ≤ k ≤ n, then A_{·,k} denotes the m-by-1 matrix consisting of
column k of A.


3.45 Example If $A = \begin{pmatrix} 8 & 4 & 5 \\ 1 & 9 & 7 \end{pmatrix}$, then A_{2,·} is row 2 of A and A_{·,2} is
column 2 of A. In other words,

$$
A_{2,\cdot} = \begin{pmatrix} 1 & 9 & 7 \end{pmatrix}
\quad\text{and}\quad
A_{\cdot,2} = \begin{pmatrix} 4 \\ 9 \end{pmatrix}.
$$

The product of a 1-by-n matrix and an n-by-1 matrix is a 1-by-1 matrix.
However, we will frequently identify a 1-by-1 matrix with its entry.

3.46 Example $\begin{pmatrix} 3 & 4 \end{pmatrix} \begin{pmatrix} 6 \\ 2 \end{pmatrix} = \begin{pmatrix} 26 \end{pmatrix}$ because 3 · 6 + 4 · 2 = 26.
However, we can identify $\begin{pmatrix} 26 \end{pmatrix}$ with 26, writing $\begin{pmatrix} 3 & 4 \end{pmatrix} \begin{pmatrix} 6 \\ 2 \end{pmatrix} = 26$.

Our next result gives another way to think of matrix multiplication: the
entry in row j, column k, of AC equals (row j of A) times (column k of C).

3.47 Entry of matrix product equals row times column
Suppose A is an m-by-n matrix and C is an n-by-p matrix. Then

(AC)_{j,k} = A_{j,·} C_{·,k}

for 1 ≤ j ≤ m and 1 ≤ k ≤ p.

The proof of the result above follows immediately from the definitions.

3.48 Example The result above and Example 3.46 show why the entry
in row 2, column 1, of the product in Example 3.42 equals 26.

The next result gives yet another way to think of matrix multiplication. It
states that column k of AC equals A times column k of C.

3.49  Column  of  matrix  product  equals  matrix  times  column 

Suppose A is an m-by-n matrix and C is an n-by-p matrix. Then

(AC)_{·,k} = A C_{·,k}

for 1 ≤ k ≤ p.

Again, the proof of the result above follows immediately from the definitions
and is left to the reader.


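3.50 Example From the result above and the equation

$$
\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}
\begin{pmatrix} 5 \\ 1 \end{pmatrix}
=
\begin{pmatrix} 7 \\ 19 \\ 31 \end{pmatrix},
$$

we see why column 2 in the matrix product in Example 3.42 is the right side
of the equation above.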


We give one more way of thinking about the product of an m-by-n matrix
and an n-by-1 matrix. The following example illustrates this approach.


3.51 Example In the example above, the product of a 3-by-2 matrix and
a 2-by-1 matrix is a linear combination of the columns of the 3-by-2 matrix,
with the scalars that multiply the columns coming from the 2-by-1 matrix.
Specifically,

$$
\begin{pmatrix} 7 \\ 19 \\ 31 \end{pmatrix}
= 5 \begin{pmatrix} 1 \\ 3 \\ 5 \end{pmatrix}
+ 1 \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}.
$$

The  next  result  generalizes  the  example  above.  Again,  the  proof  follows 
easily  from  the  definitions  and  is  left  to  the  reader. 


3.52 Linear combination of columns
Suppose A is an m-by-n matrix and $c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}$ is an n-by-1 matrix.
Then

$$
Ac = c_1 A_{\cdot,1} + \cdots + c_n A_{\cdot,n}.
$$

In other words, Ac is a linear combination of the columns of A, with the
scalars that multiply the columns coming from c.

Two more ways to think about matrix multiplication are given by Exercises
10 and 11.

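The three viewpoints in 3.47, 3.49, and 3.52 can be compared on the matrices of Example 3.42. The sketch below is illustrative only: `row` and `col` are hypothetical helpers that play the roles of A_{j,·} and A_{·,k}, and they use 0-based indices, so row 2 of the text is `row(A, 1)`.

```python
def mat_mul(A, C):
    """Matrix product as in Definition 3.41."""
    return [[sum(A[j][r] * C[r][k] for r in range(len(C)))
             for k in range(len(C[0]))] for j in range(len(A))]

def row(A, j):
    """The 1-by-n matrix consisting of row j of A (0-based j)."""
    return [list(A[j])]

def col(A, k):
    """The m-by-1 matrix consisting of column k of A (0-based k)."""
    return [[r[k]] for r in A]

A = [[1, 2], [3, 4], [5, 6]]
C = [[6, 5, 4, 3], [2, 1, 0, -1]]
AC = mat_mul(A, C)

# 3.47: the entry in row 2, column 1 of AC is (row 2 of A) times (column 1 of C).
entry = mat_mul(row(A, 1), col(C, 0))   # the 1-by-1 matrix [[26]]

# 3.49: column 2 of AC equals A times column 2 of C.
column2 = mat_mul(A, col(C, 1))         # [[7], [19], [31]]

# 3.52: A times a column is a linear combination of the columns of A.
c = col(C, 1)                           # [[5], [1]]
combo = [[sum(c[k][0] * A[j][k] for k in range(len(c)))] for j in range(len(A))]
```

Here `entry[0][0]` is 26, matching Example 3.48, and `column2`, `combo`, and `col(AC, 1)` all agree, matching Examples 3.50 and 3.51.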

EXERCISES  3.C 


1  Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Show that
with respect to each choice of bases of V and W, the matrix of T has at
least dim range T nonzero entries.

2  Suppose D ∈ ℒ(𝒫_3(R), 𝒫_2(R)) is the differentiation map defined by
Dp = p′. Find a basis of 𝒫_3(R) and a basis of 𝒫_2(R) such that the
matrix of D with respect to these bases is

$$
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.
$$

[Compare the exercise above to Example 3.34.
The next exercise generalizes the exercise above.]

3  Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Prove
that there exist a basis of V and a basis of W such that with respect to
these bases, all entries of M(T) are 0 except that the entries in row j,
column j, equal 1 for 1 ≤ j ≤ dim range T.

4  Suppose v_1, ..., v_m is a basis of V and W is finite-dimensional. Suppose
T ∈ ℒ(V, W). Prove that there exists a basis w_1, ..., w_n of W such that
all the entries in the first column of M(T) (with respect to the bases
v_1, ..., v_m and w_1, ..., w_n) are 0 except for possibly a 1 in the first row,
first column.

[In this exercise, unlike Exercise 3, you are given the basis of V instead
of being able to choose a basis of V.]

5  Suppose w_1, ..., w_n is a basis of W and V is finite-dimensional. Suppose
T ∈ ℒ(V, W). Prove that there exists a basis v_1, ..., v_m of V such
that all the entries in the first row of M(T) (with respect to the bases
v_1, ..., v_m and w_1, ..., w_n) are 0 except for possibly a 1 in the first row,
first column.

[In this exercise, unlike Exercise 3, you are given the basis of W instead
of being able to choose a basis of W.]

6  Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Prove that
dim range T = 1 if and only if there exist a basis of V and a basis of W
such that with respect to these bases, all entries of M(T) equal 1.

7  Verify  3.36. 

8  Verify  3.38. 

9  Prove  3.52. 

10 Suppose $A$ is an $m$-by-$n$ matrix and $C$ is an $n$-by-$p$ matrix. Prove that

$$(AC)_{j,\cdot} = A_{j,\cdot} C$$

for $1 \le j \le m$. In other words, show that row $j$ of $AC$ equals
(row $j$ of $A$) times $C$.


11 Suppose $a = ( a_1 \; \cdots \; a_n )$ is a 1-by-$n$ matrix and $C$ is an $n$-by-$p$
matrix. Prove that

$$aC = a_1 C_{1,\cdot} + \cdots + a_n C_{n,\cdot}.$$

In other words, show that $aC$ is a linear combination of the rows of $C$,
with the scalars that multiply the rows coming from $a$.

12 Give an example with 2-by-2 matrices to show that matrix multiplication
is not commutative. In other words, find 2-by-2 matrices $A$ and $C$ such
that $AC \ne CA$.

13 Prove that the distributive property holds for matrix addition and matrix
multiplication. In other words, suppose $A$, $B$, $C$, $D$, $E$, and $F$ are
matrices whose sizes are such that $A(B + C)$ and $(D + E)F$ make
sense. Prove that $AB + AC$ and $DF + EF$ both make sense and that
$A(B + C) = AB + AC$ and $(D + E)F = DF + EF$.

14 Prove that matrix multiplication is associative. In other words, suppose
$A$, $B$, and $C$ are matrices whose sizes are such that $(AB)C$ makes sense.
Prove that $A(BC)$ makes sense and that $(AB)C = A(BC)$.

15 Suppose $A$ is an $n$-by-$n$ matrix and $1 \le j, k \le n$. Show that the entry in
row $j$, column $k$, of $A^3$ (which is defined to mean $AAA$) is

$$\sum_{p=1}^{n} \sum_{r=1}^{n} A_{j,p} A_{p,r} A_{r,k}.$$

CHAPTER 3 Linear Maps


3.D Invertibility and Isomorphic Vector Spaces

Invertible  Linear  Maps 

We  begin  this  section  by  defining  the  notions  of  invertible  and  inverse  in  the 
context  of  linear  maps. 

3.53  Definition  invertible,  inverse 

• A linear map $T \in \mathcal{L}(V, W)$ is called invertible if there exists a
linear map $S \in \mathcal{L}(W, V)$ such that $ST$ equals the identity map on
$V$ and $TS$ equals the identity map on $W$.

• A linear map $S \in \mathcal{L}(W, V)$ satisfying $ST = I$ and $TS = I$ is
called an inverse of $T$ (note that the first $I$ is the identity map on $V$
and the second $I$ is the identity map on $W$).

3.54  Inverse  is  unique 

An  invertible  linear  map  has  a  unique  inverse. 

Proof Suppose $T \in \mathcal{L}(V, W)$ is invertible and $S_1$ and $S_2$ are inverses of $T$.
Then

$$S_1 = S_1 I = S_1(TS_2) = (S_1 T)S_2 = IS_2 = S_2.$$

Thus $S_1 = S_2$. ■

Now that we know that the inverse is unique, we can give it a notation.

3.55 Notation $T^{-1}$

If $T$ is invertible, then its inverse is denoted by $T^{-1}$. In other words, if
$T \in \mathcal{L}(V, W)$ is invertible, then $T^{-1}$ is the unique element of $\mathcal{L}(W, V)$
such that $T^{-1}T = I$ and $TT^{-1} = I$.

The  following  result  characterizes  the  invertible  linear  maps. 

3.56  Invertibility  is  equivalent  to  injectivity  and  surjectivity 

A  linear  map  is  invertible  if  and  only  if  it  is  injective  and  surjective. 


Proof Suppose $T \in \mathcal{L}(V, W)$. We need to show that $T$ is invertible if and
only if it is injective and surjective.

First suppose $T$ is invertible. To show that $T$ is injective, suppose $u, v \in V$
and $Tu = Tv$. Then

$$u = T^{-1}(Tu) = T^{-1}(Tv) = v,$$

so $u = v$. Hence $T$ is injective.

We are still assuming that $T$ is invertible. Now we want to prove that $T$ is
surjective. To do this, let $w \in W$. Then $w = T(T^{-1}w)$, which shows that $w$ is
in the range of $T$. Thus $\operatorname{range} T = W$. Hence $T$ is surjective, completing this
direction of the proof.

Now suppose $T$ is injective and surjective. We want to prove that $T$ is
invertible. For each $w \in W$, define $Sw$ to be the unique element of $V$ such
that $T(Sw) = w$ (the existence and uniqueness of such an element follow
from the surjectivity and injectivity of $T$). Clearly $T \circ S$ equals the identity
map on $W$.

To prove that $S \circ T$ equals the identity map on $V$, let $v \in V$. Then

$$T((S \circ T)v) = (T \circ S)(Tv) = I(Tv) = Tv.$$

This equation implies that $(S \circ T)v = v$ (because $T$ is injective). Thus $S \circ T$
equals the identity map on $V$.

To complete the proof, we need to show that $S$ is linear. To do this, suppose
$w_1, w_2 \in W$. Then

$$T(Sw_1 + Sw_2) = T(Sw_1) + T(Sw_2) = w_1 + w_2.$$

Thus $Sw_1 + Sw_2$ is the unique element of $V$ that $T$ maps to $w_1 + w_2$. By
the definition of $S$, this implies that $S(w_1 + w_2) = Sw_1 + Sw_2$. Hence $S$
satisfies the additive property required for linearity.

The proof of homogeneity is similar. Specifically, if $w \in W$ and $\lambda \in \mathbf{F}$,
then

$$T(\lambda Sw) = \lambda T(Sw) = \lambda w.$$

Thus $\lambda Sw$ is the unique element of $V$ that $T$ maps to $\lambda w$. By the definition of
$S$, this implies that $S(\lambda w) = \lambda Sw$. Hence $S$ is linear, as desired. ■


3.57  Example  linear  maps  that  are  not  invertible 

• The multiplication by $x^2$ linear map from $\mathcal{P}(\mathbf{R})$ to $\mathcal{P}(\mathbf{R})$ (see 3.4) is
not invertible because it is not surjective (1 is not in the range).

• The backward shift linear map from $\mathbf{F}^\infty$ to $\mathbf{F}^\infty$ (see 3.4) is not invertible
because it is not injective [$(1, 0, 0, 0, \dots)$ is in the null space].

Isomorphic Vector Spaces

The  next  definition  captures  the  idea  of  two  vector  spaces  that  are  essentially 
the  same,  except  for  the  names  of  the  elements  of  the  vector  spaces. 

3.58 Definition isomorphism, isomorphic

• An isomorphism is an invertible linear map.

• Two vector spaces are called isomorphic if there is an isomorphism
from one vector space onto the other one.

Think of an isomorphism $T : V \to W$ as relabeling $v \in V$ as $Tv \in W$. This
viewpoint explains why two isomorphic vector spaces have the same vector
space properties. The terms “isomorphism” and “invertible linear map” mean
the same thing. Use “isomorphism” when you want to emphasize that the
two spaces are essentially the same.

The Greek word isos means equal; the Greek word morph means
shape. Thus isomorphic literally means equal shape.


3.59  Dimension  shows  whether  vector  spaces  are  isomorphic 

Two  finite-dimensional  vector  spaces  over  F  are  isomorphic  if  and  only  if 
they  have  the  same  dimension. 

Proof First suppose $V$ and $W$ are isomorphic finite-dimensional vector
spaces. Thus there exists an isomorphism $T$ from $V$ onto $W$. Because $T$ is
invertible, we have $\operatorname{null} T = \{0\}$ and $\operatorname{range} T = W$. Thus $\dim \operatorname{null} T = 0$
and $\dim \operatorname{range} T = \dim W$. The formula

$$\dim V = \dim \operatorname{null} T + \dim \operatorname{range} T$$

(the Fundamental Theorem of Linear Maps, which is 3.22) thus becomes the
equation $\dim V = \dim W$, completing the proof in one direction.

To prove the other direction, suppose $V$ and $W$ are finite-dimensional
vector spaces with the same dimension. Let $v_1, \dots, v_n$ be a basis of $V$ and
$w_1, \dots, w_n$ be a basis of $W$. Let $T \in \mathcal{L}(V, W)$ be defined by

$$T(c_1 v_1 + \cdots + c_n v_n) = c_1 w_1 + \cdots + c_n w_n.$$

Then $T$ is a well-defined linear map because $v_1, \dots, v_n$ is a basis of $V$
(see 3.5). Also, $T$ is surjective because $w_1, \dots, w_n$ spans $W$. Furthermore,
$\operatorname{null} T = \{0\}$ because $w_1, \dots, w_n$ is linearly independent; thus $T$ is injective.
Because $T$ is injective and surjective, it is an isomorphism (see 3.56). Hence
$V$ and $W$ are isomorphic, as desired. ■

The previous result implies that each finite-dimensional vector space $V$ is
isomorphic to $\mathbf{F}^n$, where $n = \dim V$.

If $v_1, \dots, v_n$ is a basis of $V$ and $w_1, \dots, w_m$ is a basis of $W$, then for
each $T \in \mathcal{L}(V, W)$, we have a matrix $\mathcal{M}(T) \in \mathbf{F}^{m,n}$. In other words, once
bases have been fixed for $V$ and $W$, $\mathcal{M}$ becomes a function from $\mathcal{L}(V, W)$
to $\mathbf{F}^{m,n}$. Notice that 3.36 and 3.38 show that $\mathcal{M}$ is a linear map. This linear
map is actually invertible, as we now show.

Because every finite-dimensional vector space is isomorphic to some
$\mathbf{F}^n$, why not just study $\mathbf{F}^n$ instead of more general vector spaces? To
answer this question, note that an investigation of $\mathbf{F}^n$ would soon lead
to other vector spaces. For example, we would encounter the null
space and range of linear maps. Although each of these vector spaces
is isomorphic to some $\mathbf{F}^n$, thinking of them that way often adds
complexity but no new insight.

3.60 $\mathcal{L}(V, W)$ and $\mathbf{F}^{m,n}$ are isomorphic

Suppose $v_1, \dots, v_n$ is a basis of $V$ and $w_1, \dots, w_m$ is a basis of $W$.
Then $\mathcal{M}$ is an isomorphism between $\mathcal{L}(V, W)$ and $\mathbf{F}^{m,n}$.

Proof We already noted that $\mathcal{M}$ is linear. We need to prove that $\mathcal{M}$ is
injective and surjective. Both are easy. We begin with injectivity. If $T \in \mathcal{L}(V, W)$
and $\mathcal{M}(T) = 0$, then $Tv_k = 0$ for $k = 1, \dots, n$. Because $v_1, \dots, v_n$ is a
basis of $V$, this implies $T = 0$. Thus $\mathcal{M}$ is injective (by 3.16).

To prove that $\mathcal{M}$ is surjective, suppose $A \in \mathbf{F}^{m,n}$. Let $T$ be the linear map
from $V$ to $W$ such that

$$Tv_k = \sum_{j=1}^{m} A_{j,k} w_j$$

for $k = 1, \dots, n$ (see 3.5). Obviously $\mathcal{M}(T)$ equals $A$, and thus the range of
$\mathcal{M}$ equals $\mathbf{F}^{m,n}$, as desired. ■
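The correspondence between linear maps and matrices can be tried out numerically. The following is a minimal sketch, assuming V = F^n and W = F^m with their standard bases (an illustrative choice, not part of the text); the function names are hypothetical. It builds the map T determined by a matrix A as in the surjectivity argument, then recovers the matrix of T column by column.

```python
def linear_map_from_matrix(A):
    """Return T: F^n -> F^m with T(e_k) = sum_j A[j][k] e_j (standard bases assumed)."""
    m, n = len(A), len(A[0])

    def T(v):
        return [sum(A[j][k] * v[k] for k in range(n)) for j in range(m)]

    return T


def matrix_of(T, n):
    """Recover the matrix of T: column k holds the coordinates of T(e_k)."""
    images = [T([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    m = len(images[0])
    return [[images[k][j] for k in range(n)] for j in range(m)]


A = [[1, 2, 3],
     [4, 5, 6]]
T = linear_map_from_matrix(A)
recovered = matrix_of(T, 3)
print(recovered == A)  # True: the matrix of the map built from A is A again
```

Going from A to T and back to a matrix returns A, which is the surjectivity half of the argument in a concrete case.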

Now  we  can  determine  the  dimension  of  the  vector  space  of  linear  maps 
from  one  finite-dimensional  vector  space  to  another. 

3.61 $\dim \mathcal{L}(V, W) = (\dim V)(\dim W)$

Suppose $V$ and $W$ are finite-dimensional. Then $\mathcal{L}(V, W)$ is
finite-dimensional and

$$\dim \mathcal{L}(V, W) = (\dim V)(\dim W).$$

Proof This follows from 3.60, 3.59, and 3.40. ■

Linear Maps Thought of as Matrix Multiplication

Previously we defined the matrix of a linear map. Now we define the matrix
of a vector.

3.62 Definition matrix of a vector, $\mathcal{M}(v)$

Suppose $v \in V$ and $v_1, \dots, v_n$ is a basis of $V$. The matrix of $v$ with
respect to this basis is the $n$-by-1 matrix

$$\mathcal{M}(v) = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix},$$

where $c_1, \dots, c_n$ are the scalars such that

$$v = c_1 v_1 + \cdots + c_n v_n.$$

The matrix $\mathcal{M}(v)$ of a vector $v \in V$ depends on the basis $v_1, \dots, v_n$ of $V$,
as well as on $v$. However, the basis should be clear from the context and thus
it is not included in the notation.


3.63  Example  matrix  of  a  vector 

• The matrix of $2 - 7x + 5x^3$ with respect to the standard basis of $\mathcal{P}_3(\mathbf{R})$
is

$$\begin{pmatrix} 2 \\ -7 \\ 0 \\ 5 \end{pmatrix}.$$

• The matrix of a vector $x \in \mathbf{F}^n$ with respect to the standard basis is
obtained by writing the coordinates of $x$ as the entries in an $n$-by-1
matrix. In other words, if $x = (x_1, \dots, x_n) \in \mathbf{F}^n$, then

$$\mathcal{M}(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
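The first item of the example can be reproduced with a few lines of code. This is a sketch only: representing a polynomial as a dictionary from degree to coefficient, and the function name below, are assumptions made for illustration.

```python
def matrix_of_polynomial(coeffs_by_degree, max_degree):
    """Column matrix of a polynomial w.r.t. the standard basis 1, x, ..., x^max_degree.

    The polynomial is given as {degree: coefficient}; missing degrees count as 0.
    The result is a list of one-entry rows, i.e. an (max_degree + 1)-by-1 matrix.
    """
    return [[coeffs_by_degree.get(d, 0)] for d in range(max_degree + 1)]


p = {0: 2, 1: -7, 3: 5}              # 2 - 7x + 5x^3
column = matrix_of_polynomial(p, 3)
print(column)                        # [[2], [-7], [0], [5]]
```

The zero entry in the third row records the missing x^2 term, which is easy to overlook when reading coefficients off by eye.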

Occasionally we want to think of elements of $V$ as relabeled to be $n$-by-1
matrices. Once a basis $v_1, \dots, v_n$ is chosen, the function $\mathcal{M}$ that takes $v \in V$
to $\mathcal{M}(v)$ is an isomorphism of $V$ onto $\mathbf{F}^{n,1}$ that implements this relabeling.

Recall that if $A$ is an $m$-by-$n$ matrix, then $A_{\cdot,k}$ denotes the $k$th column of
$A$, thought of as an $m$-by-1 matrix. In the next result, $\mathcal{M}(Tv_k)$ is computed
with respect to the basis $w_1, \dots, w_m$ of $W$.

3.64 $\mathcal{M}(T)_{\cdot,k} = \mathcal{M}(Tv_k)$.

Suppose $T \in \mathcal{L}(V, W)$ and $v_1, \dots, v_n$ is a basis of $V$ and $w_1, \dots, w_m$ is
a basis of $W$. Let $1 \le k \le n$. Then the $k$th column of $\mathcal{M}(T)$, which is
denoted by $\mathcal{M}(T)_{\cdot,k}$, equals $\mathcal{M}(Tv_k)$.

Proof The desired result follows immediately from the definitions of $\mathcal{M}(T)$
and $\mathcal{M}(Tv_k)$. ■

The  next  result  shows  how  the  notions  of  the  matrix  of  a  linear  map,  the 
matrix  of  a  vector,  and  matrix  multiplication  fit  together. 

3.65  Linear  maps  act  like  matrix  multiplication 

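Suppose $T \in \mathcal{L}(V, W)$ and $v \in V$. Suppose $v_1, \dots, v_n$ is a basis of $V$ and
$w_1, \dots, w_m$ is a basis of $W$. Then

$$\mathcal{M}(Tv) = \mathcal{M}(T)\mathcal{M}(v).$$

Proof Suppose $v = c_1 v_1 + \cdots + c_n v_n$, where $c_1, \dots, c_n \in \mathbf{F}$. Thus

3.66    $Tv = c_1 Tv_1 + \cdots + c_n Tv_n.$

Hence

$$\begin{aligned}
\mathcal{M}(Tv) &= c_1 \mathcal{M}(Tv_1) + \cdots + c_n \mathcal{M}(Tv_n) \\
&= c_1 \mathcal{M}(T)_{\cdot,1} + \cdots + c_n \mathcal{M}(T)_{\cdot,n} \\
&= \mathcal{M}(T)\mathcal{M}(v),
\end{aligned}$$

where the first equality follows from 3.66 and the linearity of $\mathcal{M}$, the second
equality comes from 3.64, and the last equality comes from 3.52. ■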
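The identity in 3.65 can be checked on a concrete map. The sketch below assumes the differentiation map from polynomials of degree at most 3 to polynomials of degree at most 2, with standard bases on both sides; polynomials are stored as coefficient lists, and all function names are illustrative.

```python
def differentiate(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (index k holds the x^k coefficient)."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]


def matrix_of_map(T, n):
    """Matrix of T w.r.t. standard bases: column k is the coordinate list of T(e_k)."""
    images = [T([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    m = len(images[0])
    return [[images[k][j] for k in range(n)] for j in range(m)]


def mat_vec(A, v):
    """Product of a matrix with a column, both stored as plain lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


M_D = matrix_of_map(differentiate, 4)   # 3-by-4 matrix of differentiation
p = [2, -7, 0, 5]                       # coordinates of 2 - 7x + 5x^3
lhs = differentiate(p)                  # coordinates of the derivative, computed directly
rhs = mat_vec(M_D, p)                   # matrix of the map times matrix of the vector
print(M_D)                              # [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
print(lhs, rhs)                         # [-7, 0, 15] [-7, 0, 15]
```

Both sides give the coordinates of $-7 + 15x^2$, the derivative of $2 - 7x + 5x^3$.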

Each $m$-by-$n$ matrix $A$ induces a linear map from $\mathbf{F}^{n,1}$ to $\mathbf{F}^{m,1}$, namely the
matrix multiplication function that takes $x \in \mathbf{F}^{n,1}$ to $Ax \in \mathbf{F}^{m,1}$. The result
above can be used to think of every linear map (from one finite-dimensional
vector space to another finite-dimensional vector space) as a matrix
multiplication map after suitable relabeling via the isomorphisms given by $\mathcal{M}$.
Specifically, if $T \in \mathcal{L}(V, W)$ and we identify $v \in V$ with $\mathcal{M}(v) \in \mathbf{F}^{n,1}$, then
the result above says that we can identify $Tv$ with $\mathcal{M}(T)\mathcal{M}(v)$.

Because the result above allows us to think (via isomorphisms) of each
linear map as multiplication on $\mathbf{F}^{n,1}$ by some matrix $A$, keep in mind that the
specific matrix $A$ depends not only on the linear map but also on the choice
of bases. One of the themes of many of the most important results in later
chapters will be the choice of a basis that makes the matrix $A$ as simple as
possible.

In this book, we concentrate on linear maps rather than on matrices. However,
sometimes thinking of linear maps as matrices (or thinking of matrices
as linear maps) gives important insights that we will find useful.

Operators 

Linear  maps  from  a  vector  space  to  itself  are  so  important  that  they  get  a 
special  name  and  special  notation. 

3.67 Definition operator, $\mathcal{L}(V)$

• A linear map from a vector space to itself is called an operator.

• The notation $\mathcal{L}(V)$ denotes the set of all operators on $V$. In other
words, $\mathcal{L}(V) = \mathcal{L}(V, V)$.

A linear map is invertible if it is injective and surjective. For an
operator, you might wonder whether injectivity alone, or surjectivity alone,
is enough to imply invertibility. On infinite-dimensional vector spaces,
neither condition alone implies invertibility, as illustrated by the next
example, which uses two familiar operators from Example 3.4.

3.68  Example  neither  injectivity  nor  surjectivity  implies  invertibility 

• The multiplication by $x^2$ operator on $\mathcal{P}(\mathbf{R})$ is injective but not surjective.

• The backward shift operator on $\mathbf{F}^\infty$ is surjective but not injective.

In view of the example above, the next result is remarkable: it states
that for operators on a finite-dimensional vector space, either injectivity or
surjectivity alone implies the other condition. Often it is easier to check that
an operator on a finite-dimensional vector space is injective, and then we get
surjectivity for free.

The deepest and most important parts of linear algebra, as well as
most of the rest of this book, deal with operators.

3.69 Injectivity is equivalent to surjectivity in finite dimensions

Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Then the following are
equivalent:

(a) $T$ is invertible;

(b) $T$ is injective;

(c) $T$ is surjective.

Proof Clearly (a) implies (b).

Now suppose (b) holds, so that $T$ is injective. Thus $\operatorname{null} T = \{0\}$ (by 3.16).
From the Fundamental Theorem of Linear Maps (3.22) we have

$$\begin{aligned}
\dim \operatorname{range} T &= \dim V - \dim \operatorname{null} T \\
&= \dim V.
\end{aligned}$$

Thus $\operatorname{range} T$ equals $V$ (a subspace of $V$ whose dimension equals $\dim V$ is all
of $V$). Thus $T$ is surjective. Hence (b) implies (c).

Now suppose (c) holds, so that $T$ is surjective. Thus $\operatorname{range} T = V$. From
the Fundamental Theorem of Linear Maps (3.22) we have

$$\begin{aligned}
\dim \operatorname{null} T &= \dim V - \dim \operatorname{range} T \\
&= 0.
\end{aligned}$$

Thus $\operatorname{null} T$ equals $\{0\}$. Thus $T$ is injective (by 3.16), and so $T$ is invertible
(we already knew that $T$ was surjective). Hence (c) implies (a), completing
the proof. ■

The  next  example  illustrates  the  power  of  the  previous  result.  Although 
it  is  possible  to  prove  the  result  in  the  example  below  without  using  linear 
algebra,  the  proof  using  linear  algebra  is  cleaner  and  easier. 


3.70 Example Show that for each polynomial $q \in \mathcal{P}(\mathbf{R})$, there exists a
polynomial $p \in \mathcal{P}(\mathbf{R})$ with $((x^2 + 5x + 7)p)'' = q$.

Solution Example 3.68 shows that the magic of 3.69 does not apply to the
infinite-dimensional vector space $\mathcal{P}(\mathbf{R})$. However, each nonzero polynomial
$q$ has some degree $m$. By restricting attention to $\mathcal{P}_m(\mathbf{R})$, we can work with a
finite-dimensional vector space.

Suppose $q \in \mathcal{P}_m(\mathbf{R})$. Define $T : \mathcal{P}_m(\mathbf{R}) \to \mathcal{P}_m(\mathbf{R})$ by

$$Tp = ((x^2 + 5x + 7)p)''.$$

Multiplying a nonzero polynomial by $(x^2 + 5x + 7)$ increases the degree
by 2, and then differentiating twice reduces the degree by 2. Thus $T$ is indeed
an operator on $\mathcal{P}_m(\mathbf{R})$.

Every polynomial whose second derivative equals 0 is of the form $ax + b$,
where $a, b \in \mathbf{R}$. If $p \ne 0$, then $(x^2 + 5x + 7)p$ has degree at least 2 and so is
not of this form. Thus $\operatorname{null} T = \{0\}$. Hence $T$ is injective.

Now 3.69 implies that $T$ is surjective. Thus there exists a polynomial
$p \in \mathcal{P}_m(\mathbf{R})$ such that $((x^2 + 5x + 7)p)'' = q$, as desired.
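The existence argument can be made concrete for a specific q. The following is a minimal sketch, assuming the operator multiplies by x^2 + 5x + 7 and then differentiates twice, and taking q(x) = x^2 (so m = 2) as an illustrative choice. Polynomials are coefficient lists, exact arithmetic uses Python's fractions module, and the helper names and the Gauss-Jordan solver are not from the text.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (index k holds the x^k coefficient)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def second_derivative(c):
    """Coefficients of the second derivative of the polynomial c."""
    return [k * (k - 1) * c[k] for k in range(2, len(c))]


def T(p):
    """T p = ((x^2 + 5x + 7) p)'' on polynomials of degree at most len(p) - 1."""
    return second_derivative(poly_mul([7, 5, 1], p))


def matrix_of(op, n):
    """Matrix of an operator w.r.t. 1, x, ..., x^(n-1); column k is op(x^k)."""
    images = [op([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    return [[images[k][j] for k in range(n)] for j in range(n)]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination; assumes A is square and invertible."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


q = [0, 0, 1]                      # q(x) = x^2, so m = 2
p = solve(matrix_of(T, 3), q)      # coefficients of a p of degree at most 2 with T p = q
print(p)                           # [Fraction(3, 2), Fraction(-5, 12), Fraction(1, 12)]
print(T(p) == q)                   # True
```

Here the solver succeeds because the matrix is invertible, which is what injectivity of the operator guarantees in finite dimensions.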

Exercise 30 in Section 6.A gives a similar but more spectacular application
of 3.69. The result in that exercise is quite difficult to prove without using
linear algebra.

EXERCISES  3.D 


1 Suppose $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$ are both invertible linear maps.
Prove that $ST \in \mathcal{L}(U, W)$ is invertible and that $(ST)^{-1} = T^{-1} S^{-1}$.

2 Suppose $V$ is finite-dimensional and $\dim V > 1$. Prove that the set of
noninvertible operators on $V$ is not a subspace of $\mathcal{L}(V)$.

3 Suppose $V$ is finite-dimensional, $U$ is a subspace of $V$, and $S \in \mathcal{L}(U, V)$.
Prove there exists an invertible operator $T \in \mathcal{L}(V)$ such that $Tu = Su$
for every $u \in U$ if and only if $S$ is injective.

4 Suppose $W$ is finite-dimensional and $T_1, T_2 \in \mathcal{L}(V, W)$. Prove that
$\operatorname{null} T_1 = \operatorname{null} T_2$ if and only if there exists an invertible operator
$S \in \mathcal{L}(W)$ such that $T_1 = ST_2$.

5 Suppose $V$ is finite-dimensional and $T_1, T_2 \in \mathcal{L}(V, W)$. Prove that
$\operatorname{range} T_1 = \operatorname{range} T_2$ if and only if there exists an invertible operator
$S \in \mathcal{L}(V)$ such that $T_1 = T_2 S$.

6 Suppose $V$ and $W$ are finite-dimensional and $T_1, T_2 \in \mathcal{L}(V, W)$. Prove
that there exist invertible operators $R \in \mathcal{L}(V)$ and $S \in \mathcal{L}(W)$ such that
$T_1 = ST_2 R$ if and only if $\dim \operatorname{null} T_1 = \dim \operatorname{null} T_2$.

7 Suppose $V$ and $W$ are finite-dimensional. Let $v \in V$. Let

$$E = \{T \in \mathcal{L}(V, W) : Tv = 0\}.$$

(a) Show that $E$ is a subspace of $\mathcal{L}(V, W)$.

(b) Suppose $v \ne 0$. What is $\dim E$?

8 Suppose $V$ is finite-dimensional and $T : V \to W$ is a surjective linear
map of $V$ onto $W$. Prove that there is a subspace $U$ of $V$ such that
$T|_U$ is an isomorphism of $U$ onto $W$. (Here $T|_U$ means the function $T$
restricted to $U$. In other words, $T|_U$ is the function whose domain is $U$,
with $T|_U$ defined by $T|_U(u) = Tu$ for every $u \in U$.)

9 Suppose $V$ is finite-dimensional and $S, T \in \mathcal{L}(V)$. Prove that $ST$ is
invertible if and only if both $S$ and $T$ are invertible.

10 Suppose $V$ is finite-dimensional and $S, T \in \mathcal{L}(V)$. Prove that $ST = I$
if and only if $TS = I$.

11 Suppose $V$ is finite-dimensional and $S, T, U \in \mathcal{L}(V)$ and $STU = I$.
Show that $T$ is invertible and that $T^{-1} = US$.

12 Show that the result in the previous exercise can fail without the
hypothesis that $V$ is finite-dimensional.

13 Suppose $V$ is a finite-dimensional vector space and $R, S, T \in \mathcal{L}(V)$ are
such that $RST$ is surjective. Prove that $S$ is injective.

14 Suppose $v_1, \dots, v_n$ is a basis of $V$. Prove that the map $T : V \to \mathbf{F}^{n,1}$
defined by

$$Tv = \mathcal{M}(v)$$

is an isomorphism of $V$ onto $\mathbf{F}^{n,1}$; here $\mathcal{M}(v)$ is the matrix of $v \in V$
with respect to the basis $v_1, \dots, v_n$.

15 Prove that every linear map from $\mathbf{F}^{n,1}$ to $\mathbf{F}^{m,1}$ is given by a matrix
multiplication. In other words, prove that if $T \in \mathcal{L}(\mathbf{F}^{n,1}, \mathbf{F}^{m,1})$, then
there exists an $m$-by-$n$ matrix $A$ such that $Tx = Ax$ for every $x \in \mathbf{F}^{n,1}$.

16 Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Prove that $T$ is a scalar
multiple of the identity if and only if $ST = TS$ for every $S \in \mathcal{L}(V)$.

17 Suppose $V$ is finite-dimensional and $\mathcal{E}$ is a subspace of $\mathcal{L}(V)$ such that
$ST \in \mathcal{E}$ and $TS \in \mathcal{E}$ for all $S \in \mathcal{L}(V)$ and all $T \in \mathcal{E}$. Prove that
$\mathcal{E} = \{0\}$ or $\mathcal{E} = \mathcal{L}(V)$.

18 Show that $V$ and $\mathcal{L}(\mathbf{F}, V)$ are isomorphic vector spaces.

19 Suppose $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}))$ is such that $T$ is injective and $\deg Tp \le \deg p$
for every nonzero polynomial $p \in \mathcal{P}(\mathbf{R})$.

(a) Prove that $T$ is surjective.

(b) Prove that $\deg Tp = \deg p$ for every nonzero $p \in \mathcal{P}(\mathbf{R})$.

20 Suppose $n$ is a positive integer and $A_{i,j} \in \mathbf{F}$ for $i, j = 1, \dots, n$. Prove
that the following are equivalent (note that in both parts below, the
number of equations equals the number of variables):

(a) The trivial solution $x_1 = \cdots = x_n = 0$ is the only solution to the
homogeneous system of equations

$$\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= 0 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{n,k} x_k &= 0.
\end{aligned}$$

(b) For every $c_1, \dots, c_n \in \mathbf{F}$, there exists a solution to the system of
equations

$$\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= c_1 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{n,k} x_k &= c_n.
\end{aligned}$$

3.E Products and Quotients of Vector Spaces

Products  of  Vector  Spaces 

As  usual  when  dealing  with  more  than  one  vector  space,  all  the  vector  spaces 
in  use  should  be  over  the  same  field. 

3.71 Definition product of vector spaces

Suppose $V_1, \dots, V_m$ are vector spaces over $\mathbf{F}$.

• The product $V_1 \times \cdots \times V_m$ is defined by

$$V_1 \times \cdots \times V_m = \{(v_1, \dots, v_m) : v_1 \in V_1, \dots, v_m \in V_m\}.$$

• Addition on $V_1 \times \cdots \times V_m$ is defined by

$$(u_1, \dots, u_m) + (v_1, \dots, v_m) = (u_1 + v_1, \dots, u_m + v_m).$$

• Scalar multiplication on $V_1 \times \cdots \times V_m$ is defined by

$$\lambda(v_1, \dots, v_m) = (\lambda v_1, \dots, \lambda v_m).$$

3.72 Example Elements of $\mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^3$ are lists of length 2, with the
first item in the list an element of $\mathcal{P}_2(\mathbf{R})$ and the second item in the list an
element of $\mathbf{R}^3$.

For example, $(5 - 6x + 4x^2, (3, 8, 7)) \in \mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^3$.
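The slot-by-slot operations on a product can be sketched in code. This is only an illustration under stated assumptions: each factor is modeled as tuples of numbers (a polynomial of degree at most 2 is stored as its three coefficients), and the function names are hypothetical.

```python
def vec_add(u, v):
    """Add two vectors of the same factor space, entry by entry."""
    return tuple(a + b for a, b in zip(u, v))


def vec_scale(c, v):
    """Multiply a vector of a factor space by the scalar c."""
    return tuple(c * a for a in v)


def product_add(u, v):
    """Addition on a product of vector spaces: add in each slot separately."""
    return tuple(vec_add(uj, vj) for uj, vj in zip(u, v))


def product_scale(c, v):
    """Scalar multiplication on a product: scale each slot separately."""
    return tuple(vec_scale(c, vj) for vj in v)


u = ((5, -6, 4), (3, 8, 7))      # (5 - 6x + 4x^2, (3, 8, 7))
v = ((1, 0, 2), (0, -1, 1))      # (1 + 2x^2, (0, -1, 1))
zero = ((0, 0, 0), (0, 0, 0))    # additive identity: the zero of each factor in its slot

print(product_add(u, v))                        # ((6, -6, 6), (3, 7, 8))
print(product_scale(2, u))                      # ((10, -12, 8), (6, 16, 14))
print(product_add(u, product_scale(-1, u)) == zero)  # True
```

Keeping the two slots as separate tuples mirrors the fact that an element of the product is a list of length 2, not a list of length 6.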

The next result should be interpreted to mean that the product of vector
spaces is a vector space with the operations of addition and scalar
multiplication as defined above.

3.73 Product of vector spaces is a vector space

Suppose $V_1, \dots, V_m$ are vector spaces over $\mathbf{F}$. Then $V_1 \times \cdots \times V_m$ is a
vector space over $\mathbf{F}$.

The proof of the result above is left to the reader. Note that the additive
identity of $V_1 \times \cdots \times V_m$ is $(0, \dots, 0)$, where the 0 in the $j$th slot is the
additive identity of $V_j$. The additive inverse of $(v_1, \dots, v_m) \in V_1 \times \cdots \times V_m$
is $(-v_1, \dots, -v_m)$.

3.74 Example Is $\mathbf{R}^2 \times \mathbf{R}^3$ equal to $\mathbf{R}^5$? Is $\mathbf{R}^2 \times \mathbf{R}^3$ isomorphic to $\mathbf{R}^5$?

Solution Elements of $\mathbf{R}^2 \times \mathbf{R}^3$ are lists $((x_1, x_2), (x_3, x_4, x_5))$, where
$x_1, x_2, x_3, x_4, x_5 \in \mathbf{R}$.

Elements of $\mathbf{R}^5$ are lists $(x_1, x_2, x_3, x_4, x_5)$, where $x_1, x_2, x_3, x_4, x_5 \in \mathbf{R}$.
Although these look almost the same, they are not the same kind of object.
Elements of $\mathbf{R}^2 \times \mathbf{R}^3$ are lists of length 2 (with the first item itself a list of
length 2 and the second item a list of length 3), and elements of $\mathbf{R}^5$ are lists
of length 5. Thus $\mathbf{R}^2 \times \mathbf{R}^3$ does not equal $\mathbf{R}^5$.

The linear map that takes a vector $((x_1, x_2), (x_3, x_4, x_5)) \in \mathbf{R}^2 \times \mathbf{R}^3$ to
$(x_1, x_2, x_3, x_4, x_5) \in \mathbf{R}^5$ is clearly an isomorphism of $\mathbf{R}^2 \times \mathbf{R}^3$ onto $\mathbf{R}^5$.
Thus these two vector spaces are isomorphic.

In this case, the isomorphism is so natural that we should think of it as a
relabeling. Some people would even informally say that $\mathbf{R}^2 \times \mathbf{R}^3$ equals $\mathbf{R}^5$,
which is not technically correct but which captures the spirit of identification
via relabeling.

The  next  example  illustrates  the  idea  of  the  proof  of  3.76. 

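3.75 Example Find a basis of $\mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^2$.

Solution Consider this list of length 5 of elements of $\mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^2$:

$$(1, (0, 0)), \quad (x, (0, 0)), \quad (x^2, (0, 0)), \quad (0, (1, 0)), \quad (0, (0, 1)).$$

The list above is linearly independent and it spans $\mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^2$. Thus it is a
basis of $\mathcal{P}_2(\mathbf{R}) \times \mathbf{R}^2$.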
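The construction in this example works for any product: take each basis vector of each factor and pad the other slots with zeros. Here is a minimal sketch, assuming each factor is modeled by coordinate tuples with its standard basis (so a polynomial of degree at most 2 is stored as three coefficients); the function names are illustrative.

```python
def standard_basis(n):
    """Standard basis of an n-dimensional coordinate space, as tuples."""
    return [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]


def product_basis(bases, zeros):
    """Basis of a product of vector spaces built from a basis of each factor.

    bases[j] is a basis of the j-th factor and zeros[j] is its zero vector.
    Each basis vector is placed in its own slot, with zeros in every other slot.
    """
    result = []
    for j, basis in enumerate(bases):
        for b in basis:
            element = list(zeros)
            element[j] = b
            result.append(tuple(element))
    return result


# Polynomials of degree at most 2 (three coefficients) times R^2.
bases = [standard_basis(3), standard_basis(2)]
zeros = [(0, 0, 0), (0, 0)]
basis = product_basis(bases, zeros)
for element in basis:
    print(element)
print(len(basis))   # 5 = 3 + 2
```

The five printed elements correspond, in order, to the five list entries in the example above.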


3.76  Dimension  of  a  product  is  the  sum  of  dimensions 

Suppose $V_1, \dots, V_m$ are finite-dimensional vector spaces. Then
$V_1 \times \cdots \times V_m$ is finite-dimensional and

$$\dim(V_1 \times \cdots \times V_m) = \dim V_1 + \cdots + \dim V_m.$$

Proof Choose a basis of each $V_j$. For each basis vector of each $V_j$, consider
the element of $V_1 \times \cdots \times V_m$ that equals the basis vector in the $j$th slot and
0 in the other slots. The list of all such vectors is linearly independent and
spans $V_1 \times \cdots \times V_m$. Thus it is a basis of $V_1 \times \cdots \times V_m$. The length of this
basis is $\dim V_1 + \cdots + \dim V_m$, as desired. ■

Products and Direct Sums

In the next result, the map $T$ is surjective by the definition of $U_1 + \cdots + U_m$.
Thus the last word in the result below could be changed from “injective” to
“invertible”.

3.77 Products and direct sums

Suppose that $U_1, \dots, U_m$ are subspaces of $V$. Define a linear map
$T : U_1 \times \cdots \times U_m \to U_1 + \cdots + U_m$ by

$$T(u_1, \dots, u_m) = u_1 + \cdots + u_m.$$

Then $U_1 + \cdots + U_m$ is a direct sum if and only if $T$ is injective.

Proof The linear map $T$ is injective if and only if the only way to write 0 as a
sum $u_1 + \cdots + u_m$, where each $u_j$ is in $U_j$, is by taking each $u_j$ equal to 0.
Thus 1.44 shows that $T$ is injective if and only if $U_1 + \cdots + U_m$ is a direct
sum, as desired. ■

3.78  A  sum  is  a  direct  sum  if  and  only  if  dimensions  add  up 

Suppose $V$ is finite-dimensional and $U_1, \dots, U_m$ are subspaces of $V$. Then
$U_1 + \cdots + U_m$ is a direct sum if and only if

$$\dim(U_1 + \cdots + U_m) = \dim U_1 + \cdots + \dim U_m.$$

Proof The map $T$ in 3.77 is surjective. Thus by the Fundamental Theorem
of Linear Maps (3.22), $T$ is injective if and only if

$$\dim(U_1 + \cdots + U_m) = \dim(U_1 \times \cdots \times U_m).$$

Combining 3.77 and 3.76 now shows that $U_1 + \cdots + U_m$ is a direct sum if
and only if

$$\dim(U_1 + \cdots + U_m) = \dim U_1 + \cdots + \dim U_m,$$

as desired. ■
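The dimension criterion gives a mechanical test for direct sums. The sketch below assumes each subspace of a coordinate space is given by a spanning list of tuples and computes dimensions by row reduction over the rationals; the rank routine and function names are illustrative, not from the text.

```python
from fractions import Fraction


def rank(vectors):
    """Dimension of the span of the given vectors, by row reduction with exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_direct_sum(subspaces):
    """Test whether the dimension of the sum equals the sum of the dimensions."""
    dim_of_sum = rank([v for U in subspaces for v in U])
    return dim_of_sum == sum(rank(U) for U in subspaces)


# Three coordinate axes in R^3: dimensions 1 + 1 + 1 = 3, so the sum is direct.
print(is_direct_sum([[(1, 0, 0)], [(0, 1, 0)], [(0, 0, 1)]]))        # True

# Two coordinate planes in R^3: the sum has dimension 3, but 2 + 2 = 4.
print(is_direct_sum([[(1, 0, 0), (0, 1, 0)], [(0, 1, 0), (0, 0, 1)]]))  # False
```

In the second case the two planes share the line spanned by (0, 1, 0), which is exactly what the dimension count detects.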

In the special case $m = 2$, an alternative proof that $U_1 + U_2$ is a direct
sum if and only if $\dim(U_1 + U_2) = \dim U_1 + \dim U_2$ can be obtained by
combining 1.45 and 2.43.

Quotients of Vector Spaces

We begin our approach to quotient spaces by defining the sum of a vector and
a subspace.

3.79 Definition $v + U$

Suppose $v \in V$ and $U$ is a subspace of $V$. Then $v + U$ is the subset of $V$
defined by

$$v + U = \{v + u : u \in U\}.$$


3.80 Example Suppose

$$U = \{(x, 2x) \in \mathbf{R}^2 : x \in \mathbf{R}\}.$$

Then $U$ is the line in $\mathbf{R}^2$ through the origin with slope 2. Thus

$$(17, 20) + U$$

is the line in $\mathbf{R}^2$ that contains the point $(17, 20)$ and has slope 2.


3.81 Definition affine subset, parallel

• An affine subset of $V$ is a subset of $V$ of the form $v + U$ for some
$v \in V$ and some subspace $U$ of $V$.

• For $v \in V$ and $U$ a subspace of $V$, the affine subset $v + U$ is said to
be parallel to $U$.


3.82 Example parallel affine subsets

• In Example 3.80 above, all the lines in $\mathbf{R}^2$ with slope 2 are parallel to $U$.

• If $U = \{(x, y, 0) \in \mathbf{R}^3 : x, y \in \mathbf{R}\}$, then the affine subsets of $\mathbf{R}^3$
parallel to $U$ are the planes in $\mathbf{R}^3$ that are parallel to the $xy$-plane $U$ in
the usual sense.

Important: With the definition of parallel given in 3.81, no line in $\mathbf{R}^3$
is considered to be an affine subset that is parallel to the plane $U$.

3.83 Definition quotient space, $V/U$

Suppose $U$ is a subspace of $V$. Then the quotient space $V/U$ is the set of
all affine subsets of $V$ parallel to $U$. In other words,

$$V/U = \{v + U : v \in V\}.$$


3.84 Example quotient spaces

• If $U = \{(x, 2x) \in \mathbf{R}^2 : x \in \mathbf{R}\}$, then $\mathbf{R}^2/U$ is the set of all lines in
$\mathbf{R}^2$ that have slope 2.

• If $U$ is a line in $\mathbf{R}^3$ containing the origin, then $\mathbf{R}^3/U$ is the set of all
lines in $\mathbf{R}^3$ parallel to $U$.

• If $U$ is a plane in $\mathbf{R}^3$ containing the origin, then $\mathbf{R}^3/U$ is the set of all
planes in $\mathbf{R}^3$ parallel to $U$.
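The first item of this example can be explored with a small sketch, taking U to be the line through the origin with slope 2 in the plane. Two points give the same line of slope 2 exactly when their difference lies in U (the criterion established in 3.85), and each such line can be labeled by the point where it crosses the y-axis. The function names below, and the choice of the y-intercept as a label, are assumptions made for illustration.

```python
def in_U(p):
    """Membership in U, the set of points (x, 2x)."""
    x, y = p
    return y == 2 * x


def same_affine_subset(v, w):
    """True when v + U and w + U are the same line, i.e. when v - w lies in U."""
    return in_U((v[0] - w[0], v[1] - w[1]))


def representative(v):
    """The point where the line v + U crosses the y-axis; it identifies the line."""
    return (0, v[1] - 2 * v[0])


print(same_affine_subset((17, 20), (18, 22)))   # True: both lie on one line of slope 2
print(same_affine_subset((17, 20), (17, 21)))   # False: parallel but different lines
print(representative((17, 20)))                 # (0, -14)
print(representative((18, 22)))                 # (0, -14)
```

Integer inputs keep the comparisons exact; with floating-point coordinates the equality test in `in_U` would need a tolerance.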

Our  next  goal  is  to  make  V/U  into  a  vector  space.  To  do  this,  we  will 
need  the  following  result. 

3.85  Two  affine  subsets  parallel  to  U  are  equal  or  disjoint 

Suppose  U  is  a  subspace  of  V  and  v,  w  G  V.  Then  the  following  are 
equivalent: 

(a)  v  —  w  G  U\ 

(b)  v+U  =  w  +  U; 

(c)  (v  +  U)n(w  +  U)^  0. 

Proof  First suppose (a) holds, so v − w ∈ U. If u ∈ U, then

v + u = w + ((v − w) + u) ∈ w + U.

Thus v + U ⊆ w + U. Similarly, w + U ⊆ v + U. Thus v + U = w + U,
completing the proof that (a) implies (b).

Obviously (b) implies (c).

Now suppose (c) holds, so (v + U) ∩ (w + U) ≠ ∅. Thus there exist
u₁, u₂ ∈ U such that

v + u₁ = w + u₂.

Thus v − w = u₂ − u₁. Hence v − w ∈ U, showing that (c) implies (a) and
completing the proof. ■
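
As a small illustration of criterion (a) in 3.85, the following sketch tests whether two vectors of R² determine the same affine subset parallel to U = {(x, 2x)}, the subspace from Example 3.84. The helper names in_U and same_coset and the sample vectors are assumptions made for this example; they do not come from the text.

```python
from fractions import Fraction

def in_U(p):
    """Membership test for U = {(x, 2x)}: the line through the origin with slope 2."""
    x, y = p
    return y == 2 * x

def same_coset(v, w):
    """Criterion (a) of 3.85: v + U = w + U exactly when v - w lies in U."""
    return in_U((v[0] - w[0], v[1] - w[1]))

v = (Fraction(1), Fraction(5))   # lies on the line y = 2x + 3
w = (Fraction(4), Fraction(11))  # also lies on y = 2x + 3
z = (Fraction(0), Fraction(0))   # lies on y = 2x, that is, on U itself

print(same_coset(v, w))  # True: v - w = (-3, -6) is in U
print(same_coset(v, z))  # False: v - z = (1, 5) is not in U
```

In line with 3.85, the two lines v + U and z + U are disjoint here, while v + U and w + U are the same line.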

Now we can define addition and scalar multiplication on V/U.

3.86  Definition  addition and scalar multiplication on V/U

Suppose U is a subspace of V. Then addition and scalar multiplication
are defined on V/U by

(v + U) + (w + U) = (v + w) + U
λ(v + U) = (λv) + U

for v, w ∈ V and λ ∈ F.

As  part  of  the  proof  of  the  next  result,  we  will  show  that  the  definitions 
above  make  sense. 

3.87  Quotient  space  is  a  vector  space 

Suppose  U  is  a  subspace  of  V.  Then  V/U,  with  the  operations  of  addition 
and  scalar  multiplication  as  defined  above,  is  a  vector  space. 

Proof  The potential problem with the definitions above of addition and scalar
multiplication on V/U is that the representation of an affine subset parallel to
U is not unique. Specifically, suppose v, w ∈ V. Suppose also that v̂, ŵ ∈ V
are such that v + U = v̂ + U and w + U = ŵ + U. To show that the
definition of addition on V/U given above makes sense, we must show that
(v + w) + U = (v̂ + ŵ) + U.

By 3.85, we have

v − v̂ ∈ U and w − ŵ ∈ U.

Because U is a subspace of V and thus is closed under addition, this implies
that (v − v̂) + (w − ŵ) ∈ U. Thus (v + w) − (v̂ + ŵ) ∈ U. Using 3.85 again,
we see that

(v + w) + U = (v̂ + ŵ) + U,

as desired. Thus the definition of addition on V/U makes sense.

Similarly, suppose λ ∈ F. Because U is a subspace of V and thus is
closed under scalar multiplication, we have λ(v − v̂) ∈ U. Thus λv − λv̂ ∈ U.
Hence 3.85 implies that (λv) + U = (λv̂) + U. Thus the definition of scalar
multiplication on V/U makes sense.

Now that addition and scalar multiplication have been defined on V/U, the
verification that these operations make V/U into a vector space is
straightforward and is left to the reader. Note that the additive identity of
V/U is 0 + U (which equals U) and that the additive inverse of v + U is
(−v) + U. ■
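
The well-definedness argument above can be checked on a concrete case. In this sketch, again with U = {(x, 2x)} in R², each affine subset v + U is identified by the y-intercept of that line. The function label, the function add, and the sample representatives are all assumptions chosen for illustration.

```python
from fractions import Fraction

def label(v):
    """Identify the coset v + U, U = {(x, 2x)}, by the y-intercept of the line v + U."""
    x, y = v
    return y - 2 * x

def add(v, w):
    return (v[0] + w[0], v[1] + w[1])

# Two representatives of each coset: v and v_hat differ by an element of U,
# and so do w and w_hat.
v, v_hat = (Fraction(1), Fraction(5)), (Fraction(4), Fraction(11))
w, w_hat = (Fraction(0), Fraction(-1)), (Fraction(2), Fraction(3))

assert label(v) == label(v_hat) and label(w) == label(w_hat)
# The sum coset does not depend on which representatives were used.
print(label(add(v, w)), label(add(v_hat, w_hat)))  # 2 2
```

Two vectors have the same label exactly when their difference lies in U, so by 3.85 equal labels mean equal affine subsets.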

The next concept will give us an easy way to compute the dimension
of V/U.

3.88  Definition  quotient map, π

Suppose U is a subspace of V. The quotient map π is the linear map
π : V → V/U defined by

π(v) = v + U

for v ∈ V.

The reader should verify that π is indeed a linear map. Although π
depends on U as well as V, these spaces are left out of the notation because
they should be clear from the context.

3.89  Dimension  of  a  quotient  space 

Suppose  V  is  finite-dimensional  and  U  is  a  subspace  of  V.  Then 

dim V/U = dim V − dim U.


Proof  Let π be the quotient map from V to V/U. From 3.85, we see that
null π = U. Clearly range π = V/U. The Fundamental Theorem of Linear
Maps (3.22) thus tells us that

dim V = dim U + dim V/U,

which gives the desired result. ■

Each linear map T on V induces a linear map T̃ on V/(null T), which we
now define.

3.90  Definition  T̃

Suppose T ∈ ℒ(V, W). Define T̃ : V/(null T) → W by

T̃(v + null T) = Tv.

To show that the definition of T̃ makes sense, suppose u, v ∈ V are such
that u + null T = v + null T. By 3.85, we have u − v ∈ null T. Thus
T(u − v) = 0. Hence Tu = Tv. Thus the definition of T̃ indeed makes
sense.

3.91  Null space and range of T̃

Suppose T ∈ ℒ(V, W). Then

(a) T̃ is a linear map from V/(null T) to W;

(b) T̃ is injective;

(c) range T̃ = range T;

(d) V/(null T) is isomorphic to range T.

Proof

(a) The routine verification that T̃ is linear is left to the reader.

(b) Suppose v ∈ V and T̃(v + null T) = 0. Then Tv = 0. Thus v ∈ null T.
Hence 3.85 implies that v + null T = 0 + null T. This implies that
null T̃ = {0}, and hence T̃ is injective, as desired.

(c) The definition of T̃ shows that range T̃ = range T.

(d) Parts (b) and (c) imply that if we think of T̃ as mapping into range T,
then T̃ is an isomorphism from V/(null T) onto range T. ■
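
To make the induced map concrete, here is a minimal sketch with a sample linear map T : R³ → R². The map T, the name T_tilde, and the vectors are assumptions chosen for this example. It shows that computing T̃ from two different representatives of the same affine subset gives the same value, as the well-definedness argument after 3.90 predicts.

```python
def T(v):
    """A sample linear map T : R^3 -> R^2 (an assumption chosen for illustration)."""
    x, y, z = v
    return (x + y, z)

def T_tilde(representative):
    """T~(v + null T) = Tv, computed from any representative v of the coset."""
    return T(representative)

# null T = {(t, -t, 0)}; v and v2 differ by (5, -5, 0), so they represent the same coset.
v, v2 = (1, 2, 3), (6, -3, 3)
print(T_tilde(v), T_tilde(v2))  # (3, 3) (3, 3)
```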


EXERCISES  3.E 


1  Suppose T is a function from V to W. The graph of T is the subset of
V × W defined by

graph of T = {(v, Tv) ∈ V × W : v ∈ V}.

Prove that T is a linear map if and only if the graph of T is a subspace
of V × W.

[Formally, a function T from V to W is a subset T of V × W such that
for each v ∈ V, there exists exactly one element (v, w) ∈ T. In other
words, formally a function is what is called above its graph. We do
not usually think of functions in this formal manner. However, if we do
become formal, then the exercise above could be rephrased as follows:
Prove that a function T from V to W is a linear map if and only if T is
a subspace of V × W.]

2  Suppose V₁, …, Vₘ are vector spaces such that V₁ × ⋯ × Vₘ is
finite-dimensional. Prove that Vⱼ is finite-dimensional for each j = 1, …, m.

3  Give an example of a vector space V and subspaces U₁, U₂ of V such
that U₁ × U₂ is isomorphic to U₁ + U₂ but U₁ + U₂ is not a direct sum.

Hint: The vector space V must be infinite-dimensional.

4  Suppose V₁, …, Vₘ are vector spaces. Prove that ℒ(V₁ × ⋯ × Vₘ, W)
and ℒ(V₁, W) × ⋯ × ℒ(Vₘ, W) are isomorphic vector spaces.

5  Suppose W₁, …, Wₘ are vector spaces. Prove that ℒ(V, W₁ × ⋯ × Wₘ)
and ℒ(V, W₁) × ⋯ × ℒ(V, Wₘ) are isomorphic vector spaces.

6  For n a positive integer, define Vⁿ by

Vⁿ = V × ⋯ × V  (n times).

Prove that Vⁿ and ℒ(Fⁿ, V) are isomorphic vector spaces.

7  Suppose v, x are vectors in V and U, W are subspaces of V such that
v + U = x + W. Prove that U = W.

8  Prove that a nonempty subset A of V is an affine subset of V if and only
if λv + (1 − λ)w ∈ A for all v, w ∈ A and all λ ∈ F.

9  Suppose A₁ and A₂ are affine subsets of V. Prove that the intersection
A₁ ∩ A₂ is either an affine subset of V or the empty set.

10  Prove that the intersection of every collection of affine subsets of V is
either an affine subset of V or the empty set.

11  Suppose v₁, …, vₘ ∈ V. Let

A = {λ₁v₁ + ⋯ + λₘvₘ : λ₁, …, λₘ ∈ F and λ₁ + ⋯ + λₘ = 1}.

(a) Prove that A is an affine subset of V.

(b) Prove that every affine subset of V that contains v₁, …, vₘ also
contains A.

(c) Prove that A = v + U for some v ∈ V and some subspace U of
V with dim U ≤ m − 1.

12  Suppose U is a subspace of V such that V/U is finite-dimensional.
Prove that V is isomorphic to U × (V/U).

13  Suppose U is a subspace of V and v₁ + U, …, vₘ + U is a basis of
V/U and u₁, …, uₙ is a basis of U. Prove that v₁, …, vₘ, u₁, …, uₙ
is a basis of V.

14  Suppose U = {(x₁, x₂, …) ∈ F^∞ : xⱼ ≠ 0 for only finitely many j}.

(a) Show that U is a subspace of F^∞.

(b) Prove that F^∞/U is infinite-dimensional.

15  Suppose φ ∈ ℒ(V, F) and φ ≠ 0. Prove that dim V/(null φ) = 1.

16  Suppose U is a subspace of V such that dim V/U = 1. Prove that there
exists φ ∈ ℒ(V, F) such that null φ = U.

17  Suppose U is a subspace of V such that V/U is finite-dimensional.
Prove that there exists a subspace W of V such that dim W = dim V/U
and V = U ⊕ W.

18  Suppose T ∈ ℒ(V, W) and U is a subspace of V. Let π denote the
quotient map from V onto V/U. Prove that there exists S ∈ ℒ(V/U, W)
such that T = S ∘ π if and only if U ⊆ null T.

19  Find  a  correct  statement  analogous  to  3.78  that  is  applicable  to  finite 
sets,  with  unions  analogous  to  sums  of  subspaces  and  disjoint  unions 
analogous  to  direct  sums. 

20  Suppose U is a subspace of V. Define Γ : ℒ(V/U, W) → ℒ(V, W) by

Γ(S) = S ∘ π.

(a) Show that Γ is a linear map.

(b) Show that Γ is injective.

(c) Show that range Γ = {T ∈ ℒ(V, W) : Tu = 0 for every u ∈ U}.


3.F  Duality

The Dual Space and the Dual Map

Linear maps into the scalar field F play a special role in linear algebra, and
thus they get a special name:

3.92  Definition  linear functional

A linear functional on V is a linear map from V to F. In other words, a
linear functional is an element of ℒ(V, F).

3.93  Example  linear functionals

• Define φ : R³ → R by φ(x, y, z) = 4x − 5y + 2z. Then φ is a linear
functional on R³.

• Fix (c₁, …, cₙ) ∈ Fⁿ. Define φ : Fⁿ → F by

φ(x₁, …, xₙ) = c₁x₁ + ⋯ + cₙxₙ.

Then φ is a linear functional on Fⁿ.

• Define φ : 𝒫(R) → R by φ(p) = 3p''(5) + 7p(4). Then φ is a linear
functional on 𝒫(R).

• Define φ : 𝒫(R) → R by φ(p) = ∫₀¹ p(x) dx. Then φ is a linear
functional on 𝒫(R).
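
The first and third functionals above can be written directly as functions. The sketch below represents a polynomial by its list of coefficients [a0, a1, a2, ...], which is a representation assumed only for this example, as are the helper names and the name psi for the third functional. It spot-checks linearity on one sample and is not a proof.

```python
def phi(v):
    """First bullet of 3.93: phi(x, y, z) = 4x - 5y + 2z on R^3."""
    x, y, z = v
    return 4 * x - 5 * y + 2 * z

def evaluate(p, t):
    """Value at t of the polynomial with coefficient list p = [a0, a1, a2, ...]."""
    return sum(a * t**k for k, a in enumerate(p))

def second_derivative(p):
    return [k * (k - 1) * a for k, a in enumerate(p)][2:]

def psi(p):
    """Third bullet of 3.93: psi(p) = 3 p''(5) + 7 p(4) on P(R)."""
    return 3 * evaluate(second_derivative(p), 5) + 7 * evaluate(p, 4)

u, v, lam = (1, 2, 3), (0, -1, 4), 6
combo = tuple(lam * a + b for a, b in zip(u, v))
print(phi(combo) == lam * phi(u) + phi(v))  # True
print(psi([0, 0, 1]))  # p(x) = x^2: 3*2 + 7*16 = 118
```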

The vector space ℒ(V, F) also gets a special name and special notation:

3.94  Definition  dual space, V'

The dual space of V, denoted V', is the vector space of all linear
functionals on V. In other words, V' = ℒ(V, F).


3.95  dim V' = dim V

Suppose V is finite-dimensional. Then V' is also finite-dimensional and
dim V' = dim V.


Proof  This result follows from 3.61. ■

In the following definition, 3.5 implies that each φⱼ is well defined.

3.96  Definition  dual basis

If v₁, …, vₙ is a basis of V, then the dual basis of v₁, …, vₙ is the list
φ₁, …, φₙ of elements of V', where each φⱼ is the linear functional on V
such that

φⱼ(vₖ) = 1 if k = j,
         0 if k ≠ j.


3.97  Example  What is the dual basis of the standard basis e₁, …, eₙ
of Fⁿ?


Solution  For 1 ≤ j ≤ n, define φⱼ to be the linear functional on Fⁿ that
selects the jth coordinate of a vector in Fⁿ. In other words,

φⱼ(x₁, …, xₙ) = xⱼ

for (x₁, …, xₙ) ∈ Fⁿ. Clearly

φⱼ(eₖ) = 1 if k = j,
         0 if k ≠ j.

Thus φ₁, …, φₙ is the dual basis of the standard basis e₁, …, eₙ of Fⁿ.


The  next  result  shows  that  the  dual  basis  is  indeed  a  basis.  Thus  the 
terminology  “dual  basis”  is  justified. 

3.98  Dual  basis  is  a  basis  of  the  dual  space 

Suppose  V  is  finite-dimensional.  Then  the  dual  basis  of  a  basis  of  V  is  a 
basis  of  V'. 

Proof  Suppose v₁, …, vₙ is a basis of V. Let φ₁, …, φₙ denote the dual
basis.

To show that φ₁, …, φₙ is a linearly independent list of elements of V',
suppose a₁, …, aₙ ∈ F are such that

a₁φ₁ + ⋯ + aₙφₙ = 0.

Now (a₁φ₁ + ⋯ + aₙφₙ)(vⱼ) = aⱼ for j = 1, …, n. The equation
above thus shows that a₁ = ⋯ = aₙ = 0. Hence φ₁, …, φₙ is linearly
independent.

Now 2.39 and 3.95 imply that φ₁, …, φₙ is a basis of V'. ■
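
For a basis other than the standard one, the dual basis can be computed by a matrix inverse: if the basis vectors are the columns of a matrix, the rows of its inverse give the coefficients of the dual basis functionals. The sketch below does this for a basis of Q² chosen purely for illustration; the basis, the 2-by-2 inverse formula written out by hand, and the helper functional are assumptions of this example rather than part of the text.

```python
from fractions import Fraction

# A basis v1, v2 of Q^2 chosen for illustration (an assumption, not from the text).
v1, v2 = (Fraction(1), Fraction(1)), (Fraction(1), Fraction(2))

# Put the basis vectors in the columns of a matrix and invert it; row j of the
# inverse holds the coefficients of the dual basis functional phi_j.
det = v1[0] * v2[1] - v2[0] * v1[1]
inverse = [[ v2[1] / det, -v2[0] / det],
           [-v1[1] / det,  v1[0] / det]]

def functional(coeffs):
    return lambda x: coeffs[0] * x[0] + coeffs[1] * x[1]

phi1, phi2 = functional(inverse[0]), functional(inverse[1])

# Defining property of the dual basis: phi_j(v_k) is 1 if k = j and 0 otherwise.
print([[int(phi(v)) for v in (v1, v2)] for phi in (phi1, phi2)])  # [[1, 0], [0, 1]]
```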

In the definition below, note that if T is a linear map from V to W then T'
is a linear map from W' to V'.

3.99  Definition  dual map, T'

If T ∈ ℒ(V, W), then the dual map of T is the linear map T' ∈ ℒ(W', V')
defined by T'(φ) = φ ∘ T for φ ∈ W'.

If T ∈ ℒ(V, W) and φ ∈ W', then T'(φ) is defined above to be the
composition of the linear maps φ and T. Thus T'(φ) is indeed a linear map
from V to F; in other words, T'(φ) ∈ V'.

The verification that T' is a linear map from W' to V' is easy:

• If φ, ψ ∈ W', then

T'(φ + ψ) = (φ + ψ) ∘ T = φ ∘ T + ψ ∘ T = T'(φ) + T'(ψ).

• If λ ∈ F and φ ∈ W', then

T'(λφ) = (λφ) ∘ T = λ(φ ∘ T) = λT'(φ).

In the next example, the prime notation is used with two unrelated
meanings: D' denotes the dual of a linear map D, and p' denotes the derivative of
a polynomial p.

3.100  Example  Define D : 𝒫(R) → 𝒫(R) by Dp = p'.

• Suppose φ is the linear functional on 𝒫(R) defined by φ(p) = p(3).
Then D'(φ) is the linear functional on 𝒫(R) given by

(D'(φ))(p) = (φ ∘ D)(p) = φ(Dp) = φ(p') = p'(3).

In other words, D'(φ) is the linear functional on 𝒫(R) that takes p to
p'(3).

• Suppose φ is the linear functional on 𝒫(R) defined by φ(p) = ∫₀¹ p.
Then D'(φ) is the linear functional on 𝒫(R) given by

(D'(φ))(p) = (φ ∘ D)(p) = φ(Dp) = φ(p') = ∫₀¹ p' = p(1) − p(0).

In other words, D'(φ) is the linear functional on 𝒫(R) that takes p to
p(1) − p(0).
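
The first bullet of this example translates almost word for word into code, since the dual map is just composition. In the sketch below polynomials are coefficient lists [a0, a1, a2, ...]; that representation, the helper names, and the sample polynomial are assumptions made for illustration.

```python
def D(p):
    """Differentiate a polynomial given by its coefficient list [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(p)][1:]

def evaluate(p, t):
    return sum(a * t**k for k, a in enumerate(p))

def dual(T):
    """The dual map T': it sends a functional phi to the composition phi o T."""
    return lambda phi: (lambda p: phi(T(p)))

phi = lambda p: evaluate(p, 3)   # phi(p) = p(3), as in the first bullet
D_dual_phi = dual(D)(phi)        # D'(phi), which should take p to p'(3)

p = [1, 0, 2, 1]                 # p(x) = 1 + 2x^2 + x^3, so p'(x) = 4x + 3x^2
print(D_dual_phi(p))             # 39
```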

The first two bullet points in the result below imply that the function that
takes T to T' is a linear map from ℒ(V, W) to ℒ(W', V').

In the third bullet point below, note the reversal of order from ST on the
left to T'S' on the right (here we assume that U is a vector space over F).

3.101  Algebraic properties of dual maps

• (S + T)' = S' + T' for all S, T ∈ ℒ(V, W).

• (λT)' = λT' for all λ ∈ F and all T ∈ ℒ(V, W).

• (ST)' = T'S' for all T ∈ ℒ(U, V) and all S ∈ ℒ(V, W).

Proof  The proofs of the first two bullet points above are left to the reader.
To prove the third bullet point, suppose φ ∈ W'. Then

(ST)'(φ) = φ ∘ (ST) = (φ ∘ S) ∘ T = T'(φ ∘ S) = T'(S'(φ)) = (T'S')(φ),

where the first, third, and fourth equalities above hold because of the
definition of the dual map, the second equality holds because composition of
functions is associative, and the last equality follows from the definition of
composition.

The equality of the first and last terms above for all φ ∈ W' means that
(ST)' = T'S'. ■

The  Null  Space  and  Range  of  the  Dual  of  a  Linear  Map 

Our  goal  in  this  subsection  is  to  describe  null  T'  and  range  T'  in  terms  of 
range  T  and  null  T.  To  do  this,  we  will  need  the  following  definition. 

3.102  Definition  annihilator, U°

For U ⊆ V, the annihilator of U, denoted U°, is defined by

U° = {φ ∈ V' : φ(u) = 0 for all u ∈ U}.


Some books use the notation V* and T* for duality instead of V' and T'.
However, here we reserve the notation T* for the adjoint, which will be
introduced when we study linear maps on inner product spaces in Chapter 7.


3.103  Example  Suppose U is the subspace of 𝒫(R) consisting of all
polynomial multiples of x². If φ is the linear functional on 𝒫(R) defined by
φ(p) = p'(0), then φ ∈ U°.

For U ⊆ V, the annihilator U° is a subset of the dual space V'. Thus U°
depends on the vector space containing U, so a notation such as U°_V would be
more precise. However, the containing vector space will always be clear from
the context, so we will use the simpler notation U°.

3.104  Example  Let e₁, e₂, e₃, e₄, e₅ denote the standard basis of R⁵, and
let φ₁, φ₂, φ₃, φ₄, φ₅ denote the dual basis of (R⁵)'. Suppose

U = span(e₁, e₂) = {(x₁, x₂, 0, 0, 0) ∈ R⁵ : x₁, x₂ ∈ R}.

Show that U° = span(φ₃, φ₄, φ₅).

Solution  Recall (see 3.97) that φⱼ is the linear functional on R⁵ that selects
the jth coordinate: φⱼ(x₁, x₂, x₃, x₄, x₅) = xⱼ.

First suppose φ ∈ span(φ₃, φ₄, φ₅). Then there exist c₃, c₄, c₅ ∈ R such
that φ = c₃φ₃ + c₄φ₄ + c₅φ₅. If (x₁, x₂, 0, 0, 0) ∈ U, then

φ(x₁, x₂, 0, 0, 0) = (c₃φ₃ + c₄φ₄ + c₅φ₅)(x₁, x₂, 0, 0, 0) = 0.

Thus φ ∈ U°. In other words, we have shown that span(φ₃, φ₄, φ₅) ⊆ U°.

To show the inclusion in the other direction, suppose φ ∈ U°. Because
the dual basis is a basis of (R⁵)', there exist c₁, c₂, c₃, c₄, c₅ ∈ R such that
φ = c₁φ₁ + c₂φ₂ + c₃φ₃ + c₄φ₄ + c₅φ₅. Because e₁ ∈ U and φ ∈ U°, we
have

0 = φ(e₁) = (c₁φ₁ + c₂φ₂ + c₃φ₃ + c₄φ₄ + c₅φ₅)(e₁) = c₁.

Similarly, e₂ ∈ U and thus c₂ = 0. Hence φ = c₃φ₃ + c₄φ₄ + c₅φ₅. Thus
φ ∈ span(φ₃, φ₄, φ₅), which shows that U° ⊆ span(φ₃, φ₄, φ₅).
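
The conclusion of Example 3.104 can be spot-checked numerically. In this sketch a linear functional on R⁵ is stored as its coefficient tuple (c1, ..., c5) with respect to the dual basis; the helper names and the sample coefficients are assumptions made for illustration.

```python
def functional(c):
    """The functional c1*phi1 + ... + c5*phi5 on R^5, where phi_j picks coordinate j."""
    return lambda x: sum(cj * xj for cj, xj in zip(c, x))

e1, e2 = (1, 0, 0, 0, 0), (0, 1, 0, 0, 0)

def annihilates_U(c):
    """A functional vanishes on U = span(e1, e2) exactly when it vanishes on e1 and e2."""
    phi = functional(c)
    return phi(e1) == 0 and phi(e2) == 0

print(annihilates_U((0, 0, 7, -2, 5)))  # True: only phi3, phi4, phi5 appear
print(annihilates_U((1, 0, 7, -2, 5)))  # False: the phi1 coefficient is nonzero
```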


3.105  The  annihilator  is  a  subspace 

Suppose U ⊆ V. Then U° is a subspace of V'.


Proof  Clearly 0 ∈ U° (here 0 is the zero linear functional on V), because
the zero linear functional applied to every vector in U is 0.

Suppose φ, ψ ∈ U°. Thus φ, ψ ∈ V' and φ(u) = ψ(u) = 0 for every
u ∈ U. If u ∈ U, then (φ + ψ)(u) = φ(u) + ψ(u) = 0 + 0 = 0. Thus
φ + ψ ∈ U°.

Similarly, U° is closed under scalar multiplication. Thus 1.34 implies that
U° is a subspace of V'. ■

The next result shows that dim U° is the difference of dim V and dim U.
For example, this shows that if U is a 2-dimensional subspace of R⁵, then U°
is a 3-dimensional subspace of (R⁵)', as in Example 3.104.

The next result can be proved following the pattern of Example 3.104:
choose a basis u₁, …, uₘ of U, extend to a basis u₁, …, uₘ, …, uₙ of V,
let φ₁, …, φₘ, …, φₙ be the dual basis of V', and then show φₘ₊₁, …, φₙ
is a basis of U°, which implies the desired result.

You  should  construct  the  proof  outlined  in  the  paragraph  above,  even 
though  a  slicker  proof  is  presented  here. 

3.106  Dimension  of  the  annihilator 

Suppose  V  is  finite-dimensional  and  U  is  a  subspace  of  V.  Then 

dim  U  +  dim  U°  =  dim  V. 

Proof  Let i ∈ ℒ(U, V) be the inclusion map defined by i(u) = u for u ∈ U.
Thus i' is a linear map from V' to U'. The Fundamental Theorem of Linear
Maps (3.22) applied to i' shows that

dim range i' + dim null i' = dim V'.

However, null i' = U° (as can be seen by thinking about the definitions) and
dim V' = dim V (by 3.95), so we can rewrite the equation above as

dim range i' + dim U° = dim V.

If φ ∈ U', then φ can be extended to a linear functional ψ on V (see,
for example, Exercise 11 in Section 3.A). The definition of i' shows that
i'(ψ) = φ. Thus φ ∈ range i', which implies that range i' = U'. Hence
dim range i' = dim U' = dim U, and the displayed equation above becomes
the desired result. ■
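
The formula dim U + dim U° = dim V can be observed on a concrete subspace of Q⁵. The sketch below identifies a functional with its coefficient vector with respect to the dual basis of the standard basis, so that U° corresponds to the vectors whose dot product with every spanning vector of U is zero. The functions rref and annihilator_basis, the standard free-variable construction they use, and the sample subspace are assumptions of this example, not the method of the proof above.

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce a matrix over Q; return (reduced rows, pivot column indices)."""
    m = [list(map(Fraction, r)) for r in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        pivot_row = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot_row is None:
            continue
        m[r], m[pivot_row] = m[pivot_row], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def annihilator_basis(spanning_rows):
    """Coefficient vectors of a basis of U^0, where U is spanned by the given rows."""
    m, pivots = rref(spanning_rows)
    n = len(spanning_rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[f] = Fraction(1)          # one free coordinate set to 1
        for k, p in enumerate(pivots):
            vec[p] = -m[k][f]         # pivot coordinates forced by the equations
        basis.append(vec)
    return basis

U = [(1, 2, 0, 1, 0), (0, 1, 1, 0, 3)]   # spans a 2-dimensional subspace of Q^5
ann = annihilator_basis(U)
dim_U = len(rref(U)[1])
print(dim_U, len(ann), dim_U + len(ann))  # 2 3 5
```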

The  proof  of  part  (a)  of  the  result  below  does  not  use  the  hypothesis  that 
V  and  W  are  finite-dimensional. 

3.107  The  null  space  of  T' 

Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Then

(a) null T' = (range T)°;

(b) dim null T' = dim null T + dim W − dim V.

Proof

(a) First suppose φ ∈ null T'. Thus 0 = T'(φ) = φ ∘ T. Hence

0 = (φ ∘ T)(v) = φ(Tv) for every v ∈ V.

Thus φ ∈ (range T)°. This implies that null T' ⊆ (range T)°.

To prove the inclusion in the opposite direction, now suppose that
φ ∈ (range T)°. Thus φ(Tv) = 0 for every vector v ∈ V. Hence
0 = φ ∘ T = T'(φ). In other words, φ ∈ null T', which shows that
(range T)° ⊆ null T', completing the proof of (a).

(b) We have

dim null T' = dim (range T)°
            = dim W − dim range T
            = dim W − (dim V − dim null T)
            = dim null T + dim W − dim V,

where the first equality comes from (a), the second equality comes from
3.106, and the third equality comes from the Fundamental Theorem of
Linear Maps (3.22). ■

The next result can be useful because sometimes it is easier to verify that
T' is injective than to show directly that T is surjective.

3.108  T surjective is equivalent to T' injective

Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Then T is
surjective if and only if T' is injective.

Proof  The map T ∈ ℒ(V, W) is surjective if and only if range T = W,
which happens if and only if (range T)° = {0}, which happens if and only if
null T' = {0} [by 3.107(a)], which happens if and only if T' is injective. ■

3.109  The range of T'

Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Then

(a) dim range T' = dim range T;

(b) range T' = (null T)°.

Proof

(a) We have

dim range T' = dim W' − dim null T'
             = dim W − dim (range T)°
             = dim range T,

where the first equality comes from the Fundamental Theorem of Linear
Maps (3.22), the second equality comes from 3.95 and 3.107(a), and
the third equality comes from 3.106.

(b) First suppose φ ∈ range T'. Thus there exists ψ ∈ W' such that
φ = T'(ψ). If v ∈ null T, then

φ(v) = (T'(ψ))(v) = (ψ ∘ T)(v) = ψ(Tv) = ψ(0) = 0.

Hence φ ∈ (null T)°. This implies that range T' ⊆ (null T)°.

We will complete the proof by showing that range T' and (null T)°
have the same dimension. To do this, note that

dim range T' = dim range T
             = dim V − dim null T
             = dim (null T)°,

where the first equality comes from (a), the second equality comes from
the Fundamental Theorem of Linear Maps (3.22), and the third equality
comes from 3.106. ■

The  next  result  should  be  compared  to  3.108. 

3.110  T injective is equivalent to T' surjective

Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Then T is
injective if and only if T' is surjective.

Proof  The map T ∈ ℒ(V, W) is injective if and only if null T = {0},
which happens if and only if (null T)° = V', which happens if and only if
range T' = V' [by 3.109(b)], which happens if and only if T' is surjective. ■

The Matrix of the Dual of a Linear Map

We now define the transpose of a matrix.

3.111  Definition  transpose, Aᵗ

The transpose of a matrix A, denoted Aᵗ, is the matrix obtained from
A by interchanging the rows and columns. More specifically, if A is an
m-by-n matrix, then Aᵗ is the n-by-m matrix whose entries are given by
the equation

(Aᵗ)_{k,j} = A_{j,k}.


3.112  Example  If

A = (  5  −7 )
    (  3   8 )
    ( −4   2 ),

then

Aᵗ = (  5   3  −4 )
     ( −7   8   2 ).

Note that here A is a 3-by-2 matrix and Aᵗ is a 2-by-3 matrix.

The transpose has nice algebraic properties: (A + C)ᵗ = Aᵗ + Cᵗ and
(λA)ᵗ = λAᵗ for all m-by-n matrices A, C and all λ ∈ F (see Exercise 33).

The next result shows that the transpose of the product of two matrices is
the product of the transposes in the opposite order.

3.113  The transpose of the product of matrices

If A is an m-by-n matrix and C is an n-by-p matrix, then

(AC)ᵗ = CᵗAᵗ.

Proof  Suppose 1 ≤ k ≤ p and 1 ≤ j ≤ m. Then

((AC)ᵗ)_{k,j} = (AC)_{j,k}
              = Σ_{r=1}^{n} A_{j,r} C_{r,k}
              = Σ_{r=1}^{n} (Cᵗ)_{k,r} (Aᵗ)_{r,j}
              = (CᵗAᵗ)_{k,j}.

Thus (AC)ᵗ = CᵗAᵗ, as desired. ■
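
A quick numerical check of 3.113, using the matrix A of Example 3.112 and a second matrix C chosen only for illustration. The helper functions transpose and matmul are written for this sketch and are not part of the text.

```python
def transpose(A):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, C):
    """Matrix product: entry (j, k) is the sum over r of A[j][r] * C[r][k]."""
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*C)] for row in A]

A = [[5, -7], [3, 8], [-4, 2]]      # the 3-by-2 matrix of Example 3.112
C = [[1, 0, 2, -1], [3, 1, 0, 4]]   # a 2-by-4 matrix chosen for illustration

print(transpose(A))  # [[5, 3, -4], [-7, 8, 2]]
print(transpose(matmul(A, C)) == matmul(transpose(C), transpose(A)))  # True
```

Note that the product in the other order, transpose(A) times transpose(C), is not even defined for these sizes, which is one way to remember the reversal.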

The setting for the next result is the assumption that we have a basis
v₁, …, vₙ of V, along with its dual basis φ₁, …, φₙ of V'. We also have a
basis w₁, …, wₘ of W, along with its dual basis ψ₁, …, ψₘ of W'. Thus
ℳ(T) is computed with respect to the bases just mentioned of V and W,
and ℳ(T') is computed with respect to the dual bases just mentioned of W'
and V'.

3.114  The matrix of T' is the transpose of the matrix of T

Suppose T ∈ ℒ(V, W). Then ℳ(T') = (ℳ(T))ᵗ.

Proof  Let A = ℳ(T) and C = ℳ(T'). Suppose 1 ≤ j ≤ m and
1 ≤ k ≤ n.

From the definition of ℳ(T') we have

T'(ψⱼ) = Σ_{r=1}^{n} C_{r,j} φᵣ.

The left side of the equation above equals ψⱼ ∘ T. Thus applying both sides
of the equation above to vₖ gives

(ψⱼ ∘ T)(vₖ) = Σ_{r=1}^{n} C_{r,j} φᵣ(vₖ)
             = C_{k,j}.

We also have

(ψⱼ ∘ T)(vₖ) = ψⱼ(Tvₖ)
             = ψⱼ(Σ_{r=1}^{m} A_{r,k} wᵣ)
             = Σ_{r=1}^{m} A_{r,k} ψⱼ(wᵣ)
             = A_{j,k}.

Comparing the last line of the last two sets of equations, we have C_{k,j} = A_{j,k}.
Thus C = Aᵗ. In other words, ℳ(T') = (ℳ(T))ᵗ, as desired. ■
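
The result can be traced through on a small case. The sketch below takes a sample map T : R³ → R² and builds both matrices entry by entry with respect to the standard bases and their dual bases. The map T and the variable names are assumptions for illustration; the code also relies on the fact that the φᵣ coordinate of a functional f on R³ is f(vᵣ), where vᵣ is the rth standard basis vector.

```python
def T(v):
    """A sample linear map T : R^3 -> R^2 (an assumption chosen for illustration)."""
    x, y, z = v
    return (x + 2 * y, 3 * y - z)

std3 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # standard basis v1, v2, v3 of R^3

# M(T): column k holds the coordinates of T(v_k) in the standard basis of R^2.
M_T = [[T(v)[j] for v in std3] for j in range(2)]

# M(T'): column j holds the coordinates of T'(psi_j) = psi_j o T in the dual basis
# phi_1, phi_2, phi_3; the phi_r coordinate of a functional f is f(v_r).
M_T_dual = [[T(std3[r])[j] for j in range(2)] for r in range(3)]

print(M_T)       # [[1, 2, 0], [0, 3, -1]]
print(M_T_dual)  # [[1, 0], [2, 3], [0, -1]]
print(M_T_dual == [list(col) for col in zip(*M_T)])  # True
```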

The Rank of a Matrix

We begin by defining two nonnegative integers that are associated with each
matrix.


3.115  Definition  row rank, column rank

Suppose A is an m-by-n matrix with entries in F.

• The row rank of A is the dimension of the span of the rows of A in
F^{1,n}.

• The column rank of A is the dimension of the span of the columns
of A in F^{m,1}.


3.116  Example  Suppose

A = ( 4  7  1  8 )
    ( 3  5  2  9 ).

Find the row rank of A and the column rank of A.


Solution  The row rank of A is the dimension of

span(( 4 7 1 8 ), ( 3 5 2 9 ))

in F^{1,4}. Neither of the two vectors listed above in F^{1,4} is a scalar multiple
of the other. Thus the span of this list of length 2 has dimension 2. In other
words, the row rank of A is 2.

The column rank of A is the dimension of

     ( ( 4 )  ( 7 )  ( 1 )  ( 8 ) )
span ( ( 3 ), ( 5 ), ( 2 ), ( 9 ) )

in F^{2,1}. Neither of the first two vectors listed above in F^{2,1} is a scalar multiple
of the other. Thus the span of this list of length 4 has dimension at least 2.
The span of this list of vectors in F^{2,1} cannot have dimension larger than 2
because dim F^{2,1} = 2. Thus the span of this list has dimension 2. In other
words, the column rank of A is 2.
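
The two ranks in this example can also be computed mechanically. The sketch below finds the dimension of the span of a list of vectors by Gaussian elimination with exact rational arithmetic, and applies it first to the rows of A and then to its columns (listed as the rows of the transpose). The function rank is a helper written for this illustration, not a method used in the text.

```python
from fractions import Fraction

def rank(rows):
    """Dimension of the span of the given rows, by Gaussian elimination over Q."""
    m = [list(map(Fraction, r)) for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot_row = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot_row is None:
            continue
        m[r], m[pivot_row] = m[pivot_row], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[4, 7, 1, 8], [3, 5, 2, 9]]
columns = [list(col) for col in zip(*A)]   # the four columns of A

print(rank(A), rank(columns))  # 2 2
```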


Notice that no bases are in sight in the statement of the next result.
Although ℳ(T) in the next result depends on a choice of bases of V and W,
the next result shows that the column rank of ℳ(T) is the same for all such
choices (because range T does not depend on a choice of basis).

3.117  Dimension of range T equals column rank of ℳ(T)

Suppose V and W are finite-dimensional and T ∈ ℒ(V, W). Then
dim range T equals the column rank of ℳ(T).

Proof  Suppose v₁, …, vₙ is a basis of V and w₁, …, wₘ is a basis of W. The
function that takes w ∈ span(Tv₁, …, Tvₙ) to ℳ(w) is easily seen to be an
isomorphism from span(Tv₁, …, Tvₙ) onto span(ℳ(Tv₁), …, ℳ(Tvₙ)).
Thus dim span(Tv₁, …, Tvₙ) = dim span(ℳ(Tv₁), …, ℳ(Tvₙ)), where
the last dimension equals the column rank of ℳ(T).

It is easy to see that range T = span(Tv₁, …, Tvₙ). Thus we have
dim range T = dim span(Tv₁, …, Tvₙ) = the column rank of ℳ(T), as
desired. ■

In Example 3.116, the row rank and column rank turned out to equal each
other. The next result shows that this always happens.

3.118  Row rank equals column rank

Suppose A ∈ F^{m,n}. Then the row rank of A equals the column rank of A.

Proof  Define T : F^{n,1} → F^{m,1} by Tx = Ax. Thus ℳ(T) = A, where
ℳ(T) is computed with respect to the standard bases of F^{n,1} and F^{m,1}. Now

column rank of A = column rank of ℳ(T)
                 = dim range T
                 = dim range T'
                 = column rank of ℳ(T')
                 = column rank of Aᵗ
                 = row rank of A,

where the second equality above comes from 3.117, the third equality comes
from 3.109(a), the fourth equality comes from 3.117 (where ℳ(T') is
computed with respect to the dual bases of the standard bases), the fifth equality
comes from 3.114, and the last equality follows easily from the definitions. ■

The  last  result  allows  us  to  dispense  with  the  terms  “row  rank”  and  “column 
rank”  and  just  use  the  simpler  term  “rank”. 

3.119  Definition  rank 

The rank of a matrix A ∈ F^{m,n} is the column rank of A.

EXERCISES 3.F


1  Explain why every linear functional is either surjective or the zero map.

2  Give three distinct examples of linear functionals on R^{[0,1]}.

3  Suppose V is finite-dimensional and v ∈ V with v ≠ 0. Prove that there
exists φ ∈ V' such that φ(v) = 1.

4  Suppose V is finite-dimensional and U is a subspace of V such that
U ≠ V. Prove that there exists φ ∈ V' such that φ(u) = 0 for every
u ∈ U but φ ≠ 0.

5  Suppose V₁, …, Vₘ are vector spaces. Prove that (V₁ × ⋯ × Vₘ)' and
V₁' × ⋯ × Vₘ' are isomorphic vector spaces.

6  Suppose V is finite-dimensional and v₁, …, vₘ ∈ V. Define a linear
map Γ : V' → Fᵐ by

Γ(φ) = (φ(v₁), …, φ(vₘ)).


(a) Prove that v₁, …, vₘ spans V if and only if Γ is injective.

(b) Prove that v₁, …, vₘ is linearly independent if and only if Γ is
surjective.

7  Suppose m is a positive integer. Show that the dual basis of the basis
1, x, …, xᵐ of 𝒫ₘ(R) is φ₀, φ₁, …, φₘ, where φⱼ(p) = p^(j)(0)/j!. Here
p^(j) denotes the jth derivative of p, with the understanding that the 0th
derivative of p is p.

8  Suppose  m  is  a  positive  integer. 

(a)  Show  that  1 ,  x  —  5, . . . ,  (x  —  5)m  is  a  basis  of  Vm  (R). 

(b)  What  is  the  dual  basis  of  the  basis  in  part  (a)? 

9  Suppose  vi , . . . ,  vn  is  a  basis  of  V  and  cp\ , . . . ,  <pn  is  the  corresponding 
dual  basis  of  V'.  Suppose  \[r  e  V'.  Prove  that 

f  H - b  ty(yn)<pn. 


10 Prove the first two bullet points in 3.101.

11 Suppose $A$ is an $m$-by-$n$ matrix with $A \ne 0$. Prove that the rank of $A$
is 1 if and only if there exist $(c_1, \dots, c_m) \in \mathbf{F}^m$ and $(d_1, \dots, d_n) \in \mathbf{F}^n$
such that $A_{j,k} = c_j d_k$ for every $j = 1, \dots, m$ and every $k = 1, \dots, n$.

12 Show that the dual map of the identity map on $V$ is the identity map
on $V'$.

13 Define $T \colon \mathbf{R}^3 \to \mathbf{R}^2$ by $T(x, y, z) = (4x + 5y + 6z, 7x + 8y + 9z)$.
Suppose $\varphi_1, \varphi_2$ denotes the dual basis of the standard basis of $\mathbf{R}^2$ and
$\psi_1, \psi_2, \psi_3$ denotes the dual basis of the standard basis of $\mathbf{R}^3$.

(a) Describe the linear functionals $T'(\varphi_1)$ and $T'(\varphi_2)$.

(b) Write $T'(\varphi_1)$ and $T'(\varphi_2)$ as linear combinations of $\psi_1, \psi_2, \psi_3$.

14 Define $T \colon \mathcal{P}(\mathbf{R}) \to \mathcal{P}(\mathbf{R})$ by $(Tp)(x) = x^2 p(x) + p''(x)$ for $x \in \mathbf{R}$.

(a) Suppose $\varphi \in \mathcal{P}(\mathbf{R})'$ is defined by $\varphi(p) = p'(4)$. Describe the
linear functional $T'(\varphi)$ on $\mathcal{P}(\mathbf{R})$.

(b) Suppose $\varphi \in \mathcal{P}(\mathbf{R})'$ is defined by $\varphi(p) = \int_0^1 p(x)\,dx$. Evaluate
$\bigl(T'(\varphi)\bigr)(x^3)$.

15 Suppose $W$ is finite-dimensional and $T \in \mathcal{L}(V, W)$. Prove that $T' = 0$
if and only if $T = 0$.

16 Suppose $V$ and $W$ are finite-dimensional. Prove that the map that takes
$T \in \mathcal{L}(V, W)$ to $T' \in \mathcal{L}(W', V')$ is an isomorphism of $\mathcal{L}(V, W)$ onto
$\mathcal{L}(W', V')$.

17 Suppose $U \subset V$. Explain why $U^0 = \{\varphi \in V' : U \subset \operatorname{null} \varphi\}$.

18 Suppose $V$ is finite-dimensional and $U \subset V$. Show that $U = \{0\}$ if and
only if $U^0 = V'$.

19 Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Show that
$U = V$ if and only if $U^0 = \{0\}$.

20 Suppose $U$ and $W$ are subsets of $V$ with $U \subset W$. Prove that $W^0 \subset U^0$.

21 Suppose $V$ is finite-dimensional and $U$ and $W$ are subspaces of $V$ with
$W^0 \subset U^0$. Prove that $U \subset W$.

22 Suppose $U, W$ are subspaces of $V$. Show that $(U + W)^0 = U^0 \cap W^0$.

23 Suppose $V$ is finite-dimensional and $U$ and $W$ are subspaces of $V$. Prove
that $(U \cap W)^0 = U^0 + W^0$.

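24 Prove 3.106 using the ideas sketched in the discussion before the
statement of 3.106.

25 Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Show that

$$U = \{v \in V : \varphi(v) = 0 \text{ for every } \varphi \in U^0\}.$$

26 Suppose $V$ is finite-dimensional and $\Gamma$ is a subspace of $V'$. Show that

$$\Gamma = \{v \in V : \varphi(v) = 0 \text{ for every } \varphi \in \Gamma\}^0.$$

27 Suppose $T \in \mathcal{L}\bigl(\mathcal{P}_5(\mathbf{R}), \mathcal{P}_5(\mathbf{R})\bigr)$ and $\operatorname{null} T' = \operatorname{span}(\varphi)$, where $\varphi$ is
the linear functional on $\mathcal{P}_5(\mathbf{R})$ defined by $\varphi(p) = p(8)$. Prove that
$\operatorname{range} T = \{p \in \mathcal{P}_5(\mathbf{R}) : p(8) = 0\}$.

28 Suppose $V$ and $W$ are finite-dimensional, $T \in \mathcal{L}(V, W)$, and there exists
$\varphi \in W'$ such that $\operatorname{null} T' = \operatorname{span}(\varphi)$. Prove that $\operatorname{range} T = \operatorname{null} \varphi$.

29 Suppose $V$ and $W$ are finite-dimensional, $T \in \mathcal{L}(V, W)$, and there exists
$\varphi \in V'$ such that $\operatorname{range} T' = \operatorname{span}(\varphi)$. Prove that $\operatorname{null} T = \operatorname{null} \varphi$.

30 Suppose $V$ is finite-dimensional and $\varphi_1, \dots, \varphi_m$ is a linearly independent
list in $V'$. Prove that

$$\dim\bigl((\operatorname{null} \varphi_1) \cap \cdots \cap (\operatorname{null} \varphi_m)\bigr) = (\dim V) - m.$$

31 Suppose $V$ is finite-dimensional and $\varphi_1, \dots, \varphi_n$ is a basis of $V'$. Show
that there exists a basis of $V$ whose dual basis is $\varphi_1, \dots, \varphi_n$.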

32 Suppose $T \in \mathcal{L}(V)$, and $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of $V$. Prove
that the following are equivalent:

(a) $T$ is invertible.

(b) The columns of $\mathcal{M}(T)$ are linearly independent in $\mathbf{F}^{n,1}$.

(c) The columns of $\mathcal{M}(T)$ span $\mathbf{F}^{n,1}$.

(d) The rows of $\mathcal{M}(T)$ are linearly independent in $\mathbf{F}^{1,n}$.

(e) The rows of $\mathcal{M}(T)$ span $\mathbf{F}^{1,n}$.

Here $\mathcal{M}(T)$ means $\mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$.

33 Suppose $m$ and $n$ are positive integers. Prove that the function that takes
$A$ to $A^{t}$ is a linear map from $\mathbf{F}^{m,n}$ to $\mathbf{F}^{n,m}$. Furthermore, prove that this
linear map is invertible.

34 The double dual space of $V$, denoted $V''$, is defined to be the dual space
of $V'$. In other words, $V'' = (V')'$. Define $\Lambda \colon V \to V''$ by

$$(\Lambda v)(\varphi) = \varphi(v)$$

for $v \in V$ and $\varphi \in V'$.

(a) Show that $\Lambda$ is a linear map from $V$ to $V''$.

(b) Show that if $T \in \mathcal{L}(V)$, then $T'' \circ \Lambda = \Lambda \circ T$, where $T'' = (T')'$.

(c) Show that if $V$ is finite-dimensional, then $\Lambda$ is an isomorphism
from $V$ onto $V''$.

[Suppose $V$ is finite-dimensional. Then $V$ and $V'$ are isomorphic, but
finding an isomorphism from $V$ onto $V'$ generally requires choosing a
basis of $V$. In contrast, the isomorphism $\Lambda$ from $V$ onto $V''$ does not
require a choice of basis and thus is considered more natural.]

35 Show that $(\mathcal{P}(\mathbf{R}))'$ and $\mathbf{R}^\infty$ are isomorphic.

36 Suppose $U$ is a subspace of $V$. Let $i \colon U \to V$ be the inclusion map
defined by $i(u) = u$. Thus $i' \in \mathcal{L}(V', U')$.

(a) Show that $\operatorname{null} i' = U^0$.

(b) Prove that if $V$ is finite-dimensional, then $\operatorname{range} i' = U'$.

(c) Prove that if $V$ is finite-dimensional, then $\widetilde{i'}$ is an isomorphism
from $V'/U^0$ onto $U'$.

[The isomorphism in part (c) is natural in that it does not depend on a
choice of basis in either vector space.]

37 Suppose $U$ is a subspace of $V$. Let $\pi \colon V \to V/U$ be the usual quotient
map. Thus $\pi' \in \mathcal{L}\bigl((V/U)', V'\bigr)$.

(a) Show that $\pi'$ is injective.

(b) Show that $\operatorname{range} \pi' = U^0$.

(c) Conclude that $\pi'$ is an isomorphism from $(V/U)'$ onto $U^0$.

[The isomorphism in part (c) is natural in that it does not depend on a
choice of basis in either vector space. In fact, there is no assumption
here that any of these vector spaces are finite-dimensional.]


Polynomials


This  short  chapter  contains  material  on  polynomials  that  we  will  need  to 
understand  operators.  Many  of  the  results  in  this  chapter  will  already  be 
familiar  to  you  from  other  courses;  they  are  included  here  for  completeness. 

Because  this  chapter  is  not  about  linear  algebra,  your  instructor  may  go 
through  it  rapidly.  You  may  not  be  asked  to  scrutinize  all  the  proofs.  Make 
sure,  however,  that  you  at  least  read  and  understand  the  statements  of  all  the 
results  in  this  chapter — they  will  be  used  in  later  chapters. 

The  standing  assumption  we  need  for  this  chapter  is  as  follows: 

4.1  Notation  F 

F  denotes  R  or  C. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  Division  Algorithm  for  Polynomials 

■  factorization  of  polynomials  over  C 

■  factorization of polynomials over R


Complex Conjugate and Absolute Value

Before  discussing  polynomials  with  complex  or  real  coefficients,  we  need  to 
learn  a  bit  more  about  the  complex  numbers. 


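4.2 Definition  Re z, Im z

Suppose $z = a + bi$, where $a$ and $b$ are real numbers.

• The real part of $z$, denoted $\operatorname{Re} z$, is defined by $\operatorname{Re} z = a$.

• The imaginary part of $z$, denoted $\operatorname{Im} z$, is defined by $\operatorname{Im} z = b$.

Thus for every complex number $z$, we have

$$z = \operatorname{Re} z + (\operatorname{Im} z)i.$$


4.3 Definition  complex conjugate, $\bar{z}$, absolute value, $|z|$

Suppose $z \in \mathbf{C}$.

• The complex conjugate of $z \in \mathbf{C}$, denoted $\bar{z}$, is defined by

$$\bar{z} = \operatorname{Re} z - (\operatorname{Im} z)i.$$

• The absolute value of a complex number $z$, denoted $|z|$, is defined by

$$|z| = \sqrt{(\operatorname{Re} z)^2 + (\operatorname{Im} z)^2}.$$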


4.4 Example  Suppose $z = 3 + 2i$. Then

• $\operatorname{Re} z = 3$ and $\operatorname{Im} z = 2$;

• $\bar{z} = 3 - 2i$;

• $|z| = \sqrt{3^2 + 2^2} = \sqrt{13}$.

Note that $|z|$ is a nonnegative number for every $z \in \mathbf{C}$.
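The quantities in Example 4.4 can be checked numerically. The following is a minimal sketch using Python's built-in complex type (which writes the imaginary unit as `j`); the variable names are illustrative only and are not notation from the text.

```python
# Python's built-in complex type uses j for the imaginary unit.
z = 3 + 2j

real_part = z.real            # Re z
imag_part = z.imag            # Im z
conjugate = z.conjugate()     # the complex conjugate of z
modulus = abs(z)              # |z| = sqrt((Re z)^2 + (Im z)^2)

print(real_part, imag_part, conjugate, modulus)
```

Multiplying `z` by `conjugate` gives 13, the square of the absolute value, which previews the product property listed in 4.5.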

You should verify that $z = \bar{z}$ if and only if $z$ is a real number.

The real and imaginary parts, complex conjugate, and absolute value have
the following properties:

4.5 Properties of complex numbers

Suppose $w, z \in \mathbf{C}$. Then

sum of $z$ and $\bar{z}$

$z + \bar{z} = 2\operatorname{Re} z$;

difference of $z$ and $\bar{z}$

$z - \bar{z} = 2(\operatorname{Im} z)i$;

product of $z$ and $\bar{z}$

$z\bar{z} = |z|^2$;

additivity and multiplicativity of complex conjugate

$\overline{w + z} = \bar{w} + \bar{z}$ and $\overline{wz} = \bar{w}\,\bar{z}$;

conjugate of conjugate

$\bar{\bar{z}} = z$;

real and imaginary parts are bounded by $|z|$

$|\operatorname{Re} z| \le |z|$ and $|\operatorname{Im} z| \le |z|$;

absolute value of the complex conjugate

$|\bar{z}| = |z|$;

multiplicativity of absolute value

$|wz| = |w|\,|z|$;

Triangle Inequality

$|w + z| \le |w| + |z|$.


Proof  Except for the last item, the routine verifications of the assertions
above are left to the reader. To verify the last item, we have

$$\begin{aligned}
|w + z|^2 &= (w + z)(\bar{w} + \bar{z}) \\
&= w\bar{w} + z\bar{z} + w\bar{z} + z\bar{w} \\
&= |w|^2 + |z|^2 + w\bar{z} + \overline{w\bar{z}} \\
&= |w|^2 + |z|^2 + 2\operatorname{Re}(w\bar{z}) \\
&\le |w|^2 + |z|^2 + 2|w\bar{z}| \\
&= |w|^2 + |z|^2 + 2|w|\,|z| \\
&= (|w| + |z|)^2.
\end{aligned}$$

Taking the square root of both sides of the inequality $|w + z|^2 \le (|w| + |z|)^2$
now gives the desired inequality. ■


Uniqueness of Coefficients for Polynomials

Recall that a function $p \colon \mathbf{F} \to \mathbf{F}$ is called a polynomial with coefficients in $\mathbf{F}$
if there exist $a_0, \dots, a_m \in \mathbf{F}$ such that

4.6    $p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_m z^m$

for all $z \in \mathbf{F}$.

4.7 If a polynomial is the zero function, then all coefficients are 0

Suppose $a_0, \dots, a_m \in \mathbf{F}$. If

$$a_0 + a_1 z + \cdots + a_m z^m = 0$$

for every $z \in \mathbf{F}$, then $a_0 = \cdots = a_m = 0$.


Proof  We will prove the contrapositive. If not all the coefficients are 0, then
by changing $m$ we can assume $a_m \ne 0$. Let

$$z = \frac{|a_0| + |a_1| + \cdots + |a_{m-1}|}{|a_m|} + 1.$$

Note that $z \ge 1$, and thus $z^j \le z^{m-1}$ for $j = 0, 1, \dots, m - 1$. Using the
Triangle Inequality, we have

$$\begin{aligned}
|a_0 + a_1 z + \cdots + a_{m-1} z^{m-1}| &\le (|a_0| + |a_1| + \cdots + |a_{m-1}|)\, z^{m-1} \\
&< |a_m z^m|.
\end{aligned}$$

Thus $a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} \ne -a_m z^m$. Hence we conclude that
$a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} + a_m z^m \ne 0$. ■
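To make the choice of $z$ in the proof concrete, here is a small sketch that computes that $z$ for a given coefficient list and confirms that the polynomial does not vanish there. The function names `witness_point` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
def witness_point(coeffs):
    """Return the z used in the proof, for coefficients a_0, ..., a_m with a_m != 0."""
    *lower, leading = coeffs
    return sum(abs(a) for a in lower) / abs(leading) + 1


def evaluate(coeffs, z):
    """Evaluate a_0 + a_1 z + ... + a_m z^m by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * z + a
    return result


coeffs = [6, -5, 1]            # p(z) = 6 - 5z + z^2
z = witness_point(coeffs)      # (|6| + |-5|) / |1| + 1
value = evaluate(coeffs, z)    # nonzero, as the proof guarantees
print(z, value)
```

For this example the lower-order terms contribute at most $11 \cdot 12 = 132$ in absolute value, while the leading term contributes $12^2 = 144$, so the sum cannot be 0.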


The  result  above  implies  that  the  coefficients  of  a  polynomial  are  uniquely 
determined  (because  if  a  polynomial  had  two  different  sets  of  coefficients, 
then  subtracting  the  two  representations  of  the  polynomial  would  give  a 
contradiction  to  the  result  above). 

Recall that if a polynomial $p$ can be written in the form 4.6 with $a_m \ne 0$,
then we say that $p$ has degree $m$ and we write $\deg p = m$.

The degree of the 0 polynomial is defined to be $-\infty$. When necessary, use
the obvious arithmetic with $-\infty$. For example, $-\infty < m$ and
$-\infty + m = -\infty$ for every integer $m$.

The 0 polynomial is declared to have degree $-\infty$ so that exceptions are not
needed for various reasonable results. For example,
$\deg(pq) = \deg p + \deg q$ even if $p = 0$.


The Division Algorithm for Polynomials


If $p$ and $s$ are nonnegative integers, with $s \ne 0$, then there exist nonnegative
integers $q$ and $r$ such that

$$p = sq + r$$

and $r < s$. Think of dividing $p$ by $s$, getting quotient $q$ with remainder $r$. Our
next task is to prove an analogous result for polynomials.

Think of the Division Algorithm for Polynomials as giving the remainder $r$
when $p$ is divided by $s$.

The result below is often called the Division Algorithm for Polynomials,
although as stated here it is not really an algorithm, just a useful result.

Recall that $\mathcal{P}(\mathbf{F})$ denotes the vector space of all polynomials with
coefficients in $\mathbf{F}$ and that $\mathcal{P}_m(\mathbf{F})$ is the subspace of $\mathcal{P}(\mathbf{F})$ consisting of the
polynomials with coefficients in $\mathbf{F}$ and degree at most $m$.

The  next  result  can  be  proved  without  linear  algebra,  but  the  proof  given 
here  using  linear  algebra  is  appropriate  for  a  linear  algebra  textbook. 


4.8 Division Algorithm for Polynomials

Suppose that $p, s \in \mathcal{P}(\mathbf{F})$, with $s \ne 0$. Then there exist unique
polynomials $q, r \in \mathcal{P}(\mathbf{F})$ such that

$$p = sq + r$$

and $\deg r < \deg s$.

Proof  Let $n = \deg p$ and $m = \deg s$. If $n < m$, then take $q = 0$ and $r = p$
to get the desired result. Thus we can assume that $n \ge m$.

Define $T \colon \mathcal{P}_{n-m}(\mathbf{F}) \times \mathcal{P}_{m-1}(\mathbf{F}) \to \mathcal{P}_n(\mathbf{F})$ by

$$T(q, r) = sq + r.$$

The reader can easily verify that $T$ is a linear map. If $(q, r) \in \operatorname{null} T$, then
$sq + r = 0$, which implies that $q = 0$ and $r = 0$ [because otherwise
$\deg sq \ge m$ and thus $sq$ cannot equal $-r$]. Thus $\dim \operatorname{null} T = 0$ (proving the
“unique” part of the result).

From 3.76 we have

$$\dim\bigl(\mathcal{P}_{n-m}(\mathbf{F}) \times \mathcal{P}_{m-1}(\mathbf{F})\bigr) = (n - m + 1) + (m - 1 + 1) = n + 1.$$

The Fundamental Theorem of Linear Maps (3.22) and the equation displayed
above now imply that $\dim \operatorname{range} T = n + 1$, which equals $\dim \mathcal{P}_n(\mathbf{F})$. Thus
$\operatorname{range} T = \mathcal{P}_n(\mathbf{F})$, and hence there exist $q \in \mathcal{P}_{n-m}(\mathbf{F})$ and $r \in \mathcal{P}_{m-1}(\mathbf{F})$
such that $p = T(q, r) = sq + r$. ■
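The proof above is an existence argument based on dimension counting. For comparison, the quotient and remainder can also be computed by classical long division, which is a different technique from the one used in the proof. The sketch below assumes polynomials are stored as coefficient lists $[a_0, a_1, \dots]$, uses exact rational arithmetic, and uses $-1$ as a stand-in for the degree $-\infty$ of the 0 polynomial; the names `degree` and `divide` are illustrative.

```python
from fractions import Fraction


def degree(p):
    """Degree of the coefficient list [a_0, a_1, ...]; -1 stands in for the 0 polynomial."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return -1


def divide(p, s):
    """Return (q, r) with p = s*q + r and deg r < deg s, computed by long division."""
    m = degree(s)
    if m < 0:
        raise ZeroDivisionError("s must be a nonzero polynomial")
    r = [Fraction(a) for a in p]
    q = [Fraction(0)] * max(degree(r) - m + 1, 1)
    while degree(r) >= m:
        n = degree(r)
        factor = r[n] / Fraction(s[m])   # cancels the leading term of r
        q[n - m] = factor
        for k in range(m + 1):
            r[n - m + k] -= factor * s[k]
    return q, r


# p(z) = z^3 - 2z + 5 divided by s(z) = z - 1
q, r = divide([5, -2, 0, 1], [-1, 1])
print(q, r)
```

Here the quotient is $z^2 + z - 1$ and the remainder is the constant 4, so $z^3 - 2z + 5 = (z - 1)(z^2 + z - 1) + 4$.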
Zeros of Polynomials

The solutions to the equation $p(z) = 0$ play a crucial role in the study of a
polynomial $p \in \mathcal{P}(\mathbf{F})$. Thus these solutions have a special name.

4.9 Definition  zero of a polynomial

A number $\lambda \in \mathbf{F}$ is called a zero (or root) of a polynomial $p \in \mathcal{P}(\mathbf{F})$ if

$$p(\lambda) = 0.$$


4.10 Definition  factor

A polynomial $s \in \mathcal{P}(\mathbf{F})$ is called a factor of $p \in \mathcal{P}(\mathbf{F})$ if there exists a
polynomial $q \in \mathcal{P}(\mathbf{F})$ such that $p = sq$.

We begin by showing that $\lambda$ is a zero of a polynomial $p \in \mathcal{P}(\mathbf{F})$ if and
only if $z - \lambda$ is a factor of $p$.

4.11 Each zero of a polynomial corresponds to a degree-1 factor

Suppose $p \in \mathcal{P}(\mathbf{F})$ and $\lambda \in \mathbf{F}$. Then $p(\lambda) = 0$ if and only if there is a
polynomial $q \in \mathcal{P}(\mathbf{F})$ such that

$$p(z) = (z - \lambda)q(z)$$

for every $z \in \mathbf{F}$.

Proof  One direction is obvious. Namely, suppose there is a polynomial
$q \in \mathcal{P}(\mathbf{F})$ such that $p(z) = (z - \lambda)q(z)$ for all $z \in \mathbf{F}$. Then

$$p(\lambda) = (\lambda - \lambda)q(\lambda) = 0,$$

as desired.

To prove the other direction, suppose $p(\lambda) = 0$. The polynomial $z - \lambda$
has degree 1. Because a polynomial with degree less than 1 is a constant
function, the Division Algorithm for Polynomials (4.8) implies that there exist
a polynomial $q \in \mathcal{P}(\mathbf{F})$ and a number $r \in \mathbf{F}$ such that

$$p(z) = (z - \lambda)q(z) + r$$

for every $z \in \mathbf{F}$. The equation above and the equation $p(\lambda) = 0$ imply that
$r = 0$. Thus $p(z) = (z - \lambda)q(z)$ for every $z \in \mathbf{F}$. ■
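The decomposition $p(z) = (z - \lambda)q(z) + r$ used in this proof can be computed directly by synthetic division (Horner's rule), which also shows that $r = p(\lambda)$. The following sketch assumes coefficient lists ordered $a_0, \dots, a_m$; the function name `divide_by_linear` is hypothetical.

```python
def divide_by_linear(coeffs, lam):
    """Synthetic division: return (q, r) with p(z) = (z - lam) q(z) + r.

    coeffs lists a_0, ..., a_m; q is returned in the same low-to-high order.
    """
    carried = []
    acc = 0
    for a in reversed(coeffs):
        acc = acc * lam + a
        carried.append(acc)
    r = carried.pop()      # the final accumulated value equals p(lam)
    q = carried[::-1]
    return q, r


# p(z) = z^3 - 6z^2 + 11z - 6, which has 2 as a zero
q, r = divide_by_linear([-6, 11, -6, 1], 2)
print(q, r)
```

Because 2 is a zero of this $p$, the remainder is 0 and the quotient is $z^2 - 4z + 3$, so $z - 2$ is a factor of $p$ in the sense of 4.10.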

Now we can prove that polynomials do not have too many zeros.

4.12 A polynomial has at most as many zeros as its degree

Suppose $p \in \mathcal{P}(\mathbf{F})$ is a polynomial with degree $m \ge 0$. Then $p$ has at
most $m$ distinct zeros in $\mathbf{F}$.

Proof  If $m = 0$, then $p(z) = a_0 \ne 0$ and so $p$ has no zeros.

If $m = 1$, then $p(z) = a_0 + a_1 z$, with $a_1 \ne 0$, and thus $p$ has exactly
one zero, namely, $-a_0/a_1$.

Now suppose $m > 1$. We use induction on $m$, assuming that every
polynomial with degree $m - 1$ has at most $m - 1$ distinct zeros. If $p$ has no
zeros in $\mathbf{F}$, then we are done. If $p$ has a zero $\lambda \in \mathbf{F}$, then by 4.11 there is a
polynomial $q$ such that

$$p(z) = (z - \lambda)q(z)$$

for all $z \in \mathbf{F}$. Clearly $\deg q = m - 1$. The equation above shows that if
$p(z) = 0$, then either $z = \lambda$ or $q(z) = 0$. In other words, the zeros of $p$
consist of $\lambda$ and the zeros of $q$. By our induction hypothesis, $q$ has at most
$m - 1$ distinct zeros in $\mathbf{F}$. Thus $p$ has at most $m$ distinct zeros in $\mathbf{F}$. ■

Factorization  of  Polynomials  over  C 

So  far  we  have  been  handling  polynomials  with  complex  coefficients  and 
polynomials  with  real  coefficients  simultaneously  through  our  convention  that 
F  denotes  R  or  C.  Now  we  will  see  some  differences  between  these  two  cases. 
First  we  treat  polynomials  with  complex  coefficients.  Then  we  will  use  our 
results  about  polynomials  with  complex  coefficients  to  prove  corresponding 
results  for  polynomials  with  real  coefficients. 

The next result, although called the Fundamental Theorem of Algebra, uses
analysis in its proof. The short proof presented here uses tools from complex
analysis. If you have not had a course in complex analysis, this proof will almost
certainly be meaningless to you. In that case, just accept the Fundamental
Theorem of Algebra as something that we need to use but whose proof requires
more advanced tools that you may learn in later courses.

The Fundamental Theorem of Algebra is an existence theorem. Its proof does
not lead to a method for finding zeros. The quadratic formula gives the zeros
explicitly for polynomials of degree 2. Similar but more complicated formulas
exist for polynomials of degree 3 and 4. No such formulas exist for polynomials
of degree 5 and above.

4.13 Fundamental Theorem of Algebra

Every nonconstant polynomial with complex coefficients has a zero.

Proof  Let $p$ be a nonconstant polynomial with complex coefficients. Suppose
$p$ has no zeros. Then $1/p$ is an analytic function on $\mathbf{C}$. Furthermore,
$|p(z)| \to \infty$ as $|z| \to \infty$, which implies that $1/p \to 0$ as $|z| \to \infty$. Thus
$1/p$ is a bounded analytic function on $\mathbf{C}$. By Liouville’s theorem, every such
function is constant. But if $1/p$ is constant, then $p$ is constant, contradicting
our assumption that $p$ is nonconstant. ■

Although  the  proof  given  above  is  probably  the  shortest  proof  of  the 
Fundamental  Theorem  of  Algebra,  a  web  search  can  lead  you  to  several  other 
proofs  that  use  different  techniques.  All  proofs  of  the  Fundamental  Theorem 
of  Algebra  need  to  use  some  analysis,  because  the  result  is  not  true  if  C  is 
replaced, for example, with the set of numbers of the form $c + di$ where $c, d$
are rational numbers.

Remarkably, mathematicians have proved that no formula exists for the zeros
of polynomials of degree 5 or higher. But computers and calculators can use
clever numerical methods to find good approximations to the zeros of any
polynomial, even when exact zeros cannot be found.

For example, no one will ever be able to give an exact formula for a zero
of the polynomial $p$ defined by

$$p(x) = x^5 - 5x^4 - 6x^3 + 17x^2 + 4x - 7.$$

However, a computer or symbolic calculator can find approximate zeros of
this polynomial.
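As one illustration of such a numerical method, the sketch below uses bisection, which is only one of many possible techniques and is not claimed to be what any particular calculator does. It relies on a fact from calculus (a continuous real function that changes sign on an interval has a zero there) rather than on anything proved in this chapter. Since $p(0) = -7$ and $p(1) = 4$, there is a zero between 0 and 1.

```python
def p(x):
    """p(x) = x^5 - 5x^4 - 6x^3 + 17x^2 + 4x - 7, evaluated by Horner's rule."""
    return ((((x - 5) * x - 6) * x + 17) * x + 4) * x - 7


def bisect(f, lo, hi, tol=1e-12):
    """Approximate a zero of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                  # a zero lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid     # a zero lies in [mid, hi]
    return (lo + hi) / 2


approx_zero = bisect(p, 0.0, 1.0)
print(approx_zero, p(approx_zero))
```

The result is an approximation only; the interval $[0, 1]$ and the tolerance are choices made for this example.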

The Fundamental Theorem of Algebra leads to the following factorization
result for polynomials with complex coefficients. Note that in this factorization,
the numbers $\lambda_1, \dots, \lambda_m$ are precisely the zeros of $p$, for these are the only
values of $z$ for which the right side of the equation in the next result equals 0.


The cubic formula, which was discovered in the 16th century, is presented
below for your amusement only. Do not memorize it.

Suppose

$$p(x) = ax^3 + bx^2 + cx + d,$$

where $a \ne 0$. Set

$$u = \frac{9abc - 2b^3 - 27a^2 d}{54a^3}$$

and then set

$$v = u^2 + \left(\frac{3ac - b^2}{9a^2}\right)^3.$$

Suppose $v \ge 0$. Then

$$-\frac{b}{3a} + \sqrt[3]{u + \sqrt{v}} + \sqrt[3]{u - \sqrt{v}}$$

is a zero of $p$.
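The cubic formula above can be checked numerically. The sketch below assumes real coefficients and the case $v \ge 0$ stated in the text, and it takes real cube roots (including of negative numbers); the helper names `real_cube_root` and `cubic_zero` are introduced only for this illustration.

```python
import math


def real_cube_root(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def cubic_zero(a, b, c, d):
    """A zero of a x^3 + b x^2 + c x + d from the formula above (needs a != 0, v >= 0)."""
    u = (9 * a * b * c - 2 * b**3 - 27 * a**2 * d) / (54 * a**3)
    v = u**2 + ((3 * a * c - b**2) / (9 * a**2)) ** 3
    if v < 0:
        raise ValueError("the formula as stated assumes v >= 0")
    root_v = math.sqrt(v)
    return -b / (3 * a) + real_cube_root(u + root_v) + real_cube_root(u - root_v)


zero = cubic_zero(1, 0, 1, -2)       # p(x) = x^3 + x - 2
residual = zero**3 + zero - 2        # should be 0 up to rounding error
print(zero, residual)
```

For $p(x) = x^3 + x - 2$ we get $u = 1$ and $v = 28/27$, and the formula returns (up to rounding) the zero $x = 1$.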
4.14 Factorization of a polynomial over C

If $p \in \mathcal{P}(\mathbf{C})$ is a nonconstant polynomial, then $p$ has a unique factorization
(except for the order of the factors) of the form

$$p(z) = c(z - \lambda_1) \cdots (z - \lambda_m),$$

where $c, \lambda_1, \dots, \lambda_m \in \mathbf{C}$.

Proof  Let $p \in \mathcal{P}(\mathbf{C})$ and let $m = \deg p$. We will use induction on $m$. If
$m = 1$, then clearly the desired factorization exists and is unique. So assume
that $m > 1$ and that the desired factorization exists and is unique for all
polynomials of degree $m - 1$.

First we will show that the desired factorization of $p$ exists. By the
Fundamental Theorem of Algebra (4.13), $p$ has a zero $\lambda$. By 4.11, there is a
polynomial $q$ such that

$$p(z) = (z - \lambda)q(z)$$

for all $z \in \mathbf{C}$. Because $\deg q = m - 1$, our induction hypothesis implies that
$q$ has the desired factorization, which when plugged into the equation above
gives the desired factorization of $p$.

Now we turn to the question of uniqueness. Clearly $c$ is uniquely determined
as the coefficient of $z^m$ in $p$. So we need only show that except for the
order, there is only one way to choose $\lambda_1, \dots, \lambda_m$. If

$$(z - \lambda_1) \cdots (z - \lambda_m) = (z - \tau_1) \cdots (z - \tau_m)$$

for all $z \in \mathbf{C}$, then because the left side of the equation above equals 0 when
$z = \lambda_1$, one of the $\tau$’s on the right side equals $\lambda_1$. Relabeling, we can assume
that $\tau_1 = \lambda_1$. Now for $z \ne \lambda_1$, we can divide both sides of the equation
above by $z - \lambda_1$, getting

$$(z - \lambda_2) \cdots (z - \lambda_m) = (z - \tau_2) \cdots (z - \tau_m)$$

for all $z \in \mathbf{C}$ except possibly $z = \lambda_1$. Actually the equation above holds
for all $z \in \mathbf{C}$, because otherwise by subtracting the right side from the left
side we would get a nonzero polynomial that has infinitely many zeros. The
equation above and our induction hypothesis imply that except for the order,
the $\lambda$’s are the same as the $\tau$’s, completing the proof of uniqueness. ■


Factorization of Polynomials over R

A polynomial with real coefficients may have no real zeros. For example, the
polynomial $1 + x^2$ has no real zeros.

To obtain a factorization theorem over $\mathbf{R}$, we will use our factorization
theorem over $\mathbf{C}$. We begin with the following result.

4.15 Polynomials with real coefficients have zeros in pairs

Suppose $p \in \mathcal{P}(\mathbf{C})$ is a polynomial with real coefficients. If $\lambda \in \mathbf{C}$ is a
zero of $p$, then so is $\bar{\lambda}$.

Proof  Let

$$p(z) = a_0 + a_1 z + \cdots + a_m z^m,$$

where $a_0, \dots, a_m$ are real numbers. Suppose $\lambda \in \mathbf{C}$ is a zero of $p$. Then

$$a_0 + a_1 \lambda + \cdots + a_m \lambda^m = 0.$$

Take the complex conjugate of both sides of this equation, obtaining

$$a_0 + a_1 \bar{\lambda} + \cdots + a_m \bar{\lambda}^m = 0,$$

where we have used basic properties of complex conjugation (see 4.5). The
equation above shows that $\bar{\lambda}$ is a zero of $p$. ■
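A quick numerical illustration of 4.15, using a polynomial chosen only for this example: $p(z) = z^3 - z^2 + z - 1 = (z - 1)(z^2 + 1)$ has real coefficients and the nonreal zero $i$, so $-i$ must be a zero as well. The helper name `evaluate` is hypothetical.

```python
def evaluate(coeffs, z):
    """Evaluate a_0 + a_1 z + ... + a_m z^m by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * z + a
    return result


coeffs = [-1, 1, -1, 1]      # p(z) = z^3 - z^2 + z - 1, real coefficients
lam = 1j                     # the imaginary unit i

value_at_lam = evaluate(coeffs, lam)
value_at_conj = evaluate(coeffs, lam.conjugate())
print(value_at_lam, value_at_conj)
```

More generally, the proof shows that for real coefficients the value of $p$ at $\bar{w}$ is the conjugate of the value of $p$ at $w$, for every complex $w$.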

We want a factorization theorem for polynomials with real coefficients. First
we need to characterize the polynomials of degree 2 with real coefficients
that can be written as the product of two polynomials of degree 1 with real
coefficients.

4.16 Factorization of a quadratic polynomial

Suppose $b, c \in \mathbf{R}$. Then there is a polynomial factorization of the form

$$x^2 + bx + c = (x - \lambda_1)(x - \lambda_2)$$

with $\lambda_1, \lambda_2 \in \mathbf{R}$ if and only if $b^2 \ge 4c$.

Think about the connection between the quadratic formula and 4.16.

The failure of the Fundamental Theorem of Algebra for $\mathbf{R}$ accounts for the
differences between operators on real and complex vector spaces, as we will
see in later chapters.

Proof  Notice that

$$x^2 + bx + c = \left(x + \frac{b}{2}\right)^2 + \left(c - \frac{b^2}{4}\right).$$

The equation above is the basis of the technique called completing the square.

First suppose $b^2 < 4c$. Then clearly the right side of the equation above is
positive for every $x \in \mathbf{R}$. Hence the polynomial $x^2 + bx + c$ has no real
zeros and thus cannot be factored in the form $(x - \lambda_1)(x - \lambda_2)$ with
$\lambda_1, \lambda_2 \in \mathbf{R}$.

Conversely, now suppose $b^2 \ge 4c$. Then there is a real number $d$ such
that $d^2 = \frac{b^2}{4} - c$. From the displayed equation above, we have

$$\begin{aligned}
x^2 + bx + c &= \left(x + \frac{b}{2}\right)^2 - d^2 \\
&= \left(x + \frac{b}{2} + d\right)\left(x + \frac{b}{2} - d\right),
\end{aligned}$$

which gives the desired factorization. ■
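The proof of 4.16 is constructive, so it translates directly into a computation. The sketch below follows the completing-the-square argument: it returns the two real numbers $\lambda_1, \lambda_2$ when $b^2 \ge 4c$ and `None` otherwise. The function name `real_linear_factors` is hypothetical.

```python
import math


def real_linear_factors(b, c):
    """Return (lam1, lam2) with x^2 + bx + c = (x - lam1)(x - lam2), or None if b^2 < 4c."""
    d_squared = b * b / 4 - c        # this is d^2 from the proof
    if d_squared < 0:
        return None                  # no real factorization exists
    d = math.sqrt(d_squared)
    # (x + b/2 + d)(x + b/2 - d) = (x - lam1)(x - lam2)
    return (-b / 2 - d, -b / 2 + d)


factors = real_linear_factors(-5, 6)      # x^2 - 5x + 6
no_factors = real_linear_factors(0, 1)    # x^2 + 1
print(factors, no_factors)
```

For $x^2 - 5x + 6$ we have $d^2 = 1/4$, giving $\lambda_1 = 2$ and $\lambda_2 = 3$; for $x^2 + 1$ we have $b^2 = 0 < 4 = 4c$, so no real factorization exists.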


The next result gives a factorization of a polynomial over $\mathbf{R}$. The idea of
the proof is to use the factorization 4.14 of $p$ as a polynomial with complex
coefficients. Complex but nonreal zeros of $p$ come in pairs; see 4.15. Thus
if the factorization of $p$ as an element of $\mathcal{P}(\mathbf{C})$ includes terms of the form
$(x - \lambda)$ with $\lambda$ a nonreal complex number, then $(x - \bar{\lambda})$ is also a term in the
factorization. Multiplying together these two terms, we get

$$x^2 - 2(\operatorname{Re} \lambda)x + |\lambda|^2,$$

which is a quadratic term of the required form.

The idea sketched in the paragraph above almost provides a proof of the
existence of our desired factorization. However, we need to be careful about
one point. Suppose $\lambda$ is a nonreal complex number and $(x - \lambda)$ is a term in
the factorization of $p$ as an element of $\mathcal{P}(\mathbf{C})$. We are guaranteed by 4.15 that
$(x - \bar{\lambda})$ also appears as a term in the factorization, but 4.15 does not state that
these two factors appear the same number of times, as needed to make the
idea above work. However, the proof works around this point.

In the next result, either $m$ or $M$ may equal 0. The numbers $\lambda_1, \dots, \lambda_m$
are precisely the real zeros of $p$, for these are the only real values of $x$ for
which the right side of the equation in the next result equals 0.

4.17 Factorization of a polynomial over R

Suppose $p \in \mathcal{P}(\mathbf{R})$ is a nonconstant polynomial. Then $p$ has a unique
factorization (except for the order of the factors) of the form

$$p(x) = c(x - \lambda_1) \cdots (x - \lambda_m)(x^2 + b_1 x + c_1) \cdots (x^2 + b_M x + c_M),$$

where $c, \lambda_1, \dots, \lambda_m, b_1, \dots, b_M, c_1, \dots, c_M \in \mathbf{R}$, with $b_j^2 < 4c_j$ for
each $j$.

Proof  Think of $p$ as an element of $\mathcal{P}(\mathbf{C})$. If all the (complex) zeros of $p$ are
real, then we are done by 4.14. Thus suppose $p$ has a zero $\lambda \in \mathbf{C}$ with $\lambda \notin \mathbf{R}$.
By 4.15, $\bar{\lambda}$ is a zero of $p$. Thus we can write

$$\begin{aligned}
p(x) &= (x - \lambda)(x - \bar{\lambda})q(x) \\
&= \bigl(x^2 - 2(\operatorname{Re} \lambda)x + |\lambda|^2\bigr)q(x)
\end{aligned}$$

for some polynomial $q \in \mathcal{P}(\mathbf{C})$ with degree two less than the degree of $p$.
If we can prove that $q$ has real coefficients, then by using induction on the
degree of $p$, we can conclude that $(x - \lambda)$ appears in the factorization of $p$
exactly as many times as $(x - \bar{\lambda})$.

To prove that $q$ has real coefficients, we solve the equation above for $q$,
getting

$$q(x) = \frac{p(x)}{x^2 - 2(\operatorname{Re} \lambda)x + |\lambda|^2}$$

for all $x \in \mathbf{R}$. The equation above implies that $q(x) \in \mathbf{R}$ for all $x \in \mathbf{R}$.
Writing

$$q(x) = a_0 + a_1 x + \cdots + a_{n-2} x^{n-2},$$

where $n = \deg p$ and $a_0, \dots, a_{n-2} \in \mathbf{C}$, we thus have

$$0 = \operatorname{Im} q(x) = (\operatorname{Im} a_0) + (\operatorname{Im} a_1)x + \cdots + (\operatorname{Im} a_{n-2})x^{n-2}$$

for all $x \in \mathbf{R}$. This implies that $\operatorname{Im} a_0, \dots, \operatorname{Im} a_{n-2}$ all equal 0 (by 4.7). Thus
all the coefficients of $q$ are real, as desired. Hence the desired factorization
exists.

Now we turn to the question of uniqueness of our factorization. A factor
of $p$ of the form $x^2 + b_j x + c_j$ with $b_j^2 < 4c_j$ can be uniquely written
as $(x - \lambda_j)(x - \bar{\lambda}_j)$ with $\lambda_j \in \mathbf{C}$. A moment’s thought shows that two
different factorizations of $p$ as an element of $\mathcal{P}(\mathbf{R})$ would lead to two different
factorizations of $p$ as an element of $\mathcal{P}(\mathbf{C})$, contradicting 4.14. ■
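As a small worked instance of 4.17, chosen here only for illustration, consider $p(x) = x^4 - 1$. Its complex zeros are $1, -1, i, -i$, and pairing the conjugate zeros $i$ and $-i$ as in the proof gives the real factorization:

```latex
\begin{aligned}
x^4 - 1 &= (x - 1)(x + 1)(x - i)(x + i)
  && \text{factorization over } \mathbf{C} \text{ (4.14)} \\
(x - i)(x + i) &= x^2 - 2(\operatorname{Re} i)\,x + |i|^2 = x^2 + 1
  && \text{pairing } \lambda = i \text{ with } \bar{\lambda} = -i \\
x^4 - 1 &= (x - 1)(x + 1)(x^2 + 1)
  && \text{factorization over } \mathbf{R} \text{ (4.17)}
\end{aligned}
```

In the notation of 4.17 this has $c = 1$, $m = 2$ with $\lambda_1 = 1$ and $\lambda_2 = -1$, and $M = 1$ with $b_1 = 0$ and $c_1 = 1$, so that $b_1^2 = 0 < 4 = 4c_1$ as required.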
EXERCISES 4


1 Verify all the assertions in 4.5 except the last one.

2 Suppose $m$ is a positive integer. Is the set

$$\{0\} \cup \{p \in \mathcal{P}(\mathbf{F}) : \deg p = m\}$$

a subspace of $\mathcal{P}(\mathbf{F})$?

3 Is the set

$$\{0\} \cup \{p \in \mathcal{P}(\mathbf{F}) : \deg p \text{ is even}\}$$

a subspace of $\mathcal{P}(\mathbf{F})$?

4 Suppose $m$ and $n$ are positive integers with $m \le n$, and suppose
$\lambda_1, \dots, \lambda_m \in \mathbf{F}$. Prove that there exists a polynomial $p \in \mathcal{P}(\mathbf{F})$ with
$\deg p = n$ such that $0 = p(\lambda_1) = \cdots = p(\lambda_m)$ and such that $p$ has no
other zeros.

5 Suppose $m$ is a nonnegative integer, $z_1, \dots, z_{m+1}$ are distinct elements
of $\mathbf{F}$, and $w_1, \dots, w_{m+1} \in \mathbf{F}$. Prove that there exists a unique polynomial
$p \in \mathcal{P}_m(\mathbf{F})$ such that

$$p(z_j) = w_j$$

for $j = 1, \dots, m + 1$.

[This result can be proved without using linear algebra. However, try to
find the clearer, shorter proof that uses some linear algebra.]

6 Suppose $p \in \mathcal{P}(\mathbf{C})$ has degree $m$. Prove that $p$ has $m$ distinct zeros if
and only if $p$ and its derivative $p'$ have no zeros in common.

7 Prove that every polynomial of odd degree with real coefficients has a
real zero.

8 Define $T \colon \mathcal{P}(\mathbf{R}) \to \mathbf{R}^{\mathbf{R}}$ by

$$(Tp)(x) = \begin{cases} \dfrac{p(x) - p(3)}{x - 3} & \text{if } x \ne 3, \\[1ex] p'(3) & \text{if } x = 3. \end{cases}$$

Show that $Tp \in \mathcal{P}(\mathbf{R})$ for every polynomial $p \in \mathcal{P}(\mathbf{R})$ and that $T$ is a
linear map.

9 Suppose $p \in \mathcal{P}(\mathbf{C})$. Define $q \colon \mathbf{C} \to \mathbf{C}$ by

$$q(z) = p(z)\,\overline{p(\bar{z})}.$$

Prove that $q$ is a polynomial with real coefficients.

10 Suppose $m$ is a nonnegative integer and $p \in \mathcal{P}_m(\mathbf{C})$ is such that there
exist distinct real numbers $x_0, x_1, \dots, x_m$ such that $p(x_j) \in \mathbf{R}$ for
$j = 0, 1, \dots, m$. Prove that all the coefficients of $p$ are real.

11 Suppose $p \in \mathcal{P}(\mathbf{F})$ with $p \ne 0$. Let $U = \{pq : q \in \mathcal{P}(\mathbf{F})\}$.

(a) Show that $\dim \mathcal{P}(\mathbf{F})/U = \deg p$.

(b) Find a basis of $\mathcal{P}(\mathbf{F})/U$.

Statue of Italian mathematician
Leonardo of Pisa (1170-1250,
approximate dates), also known as
Fibonacci. Exercise 16 in Section 5.C
shows how linear algebra can be used
to find an explicit formula for the
Fibonacci sequence.


Eigenvalues, Eigenvectors, and
Invariant  Subspaces 


Linear  maps  from  one  vector  space  to  another  vector  space  were  the  objects 
of  study  in  Chapter  3.  Now  we  begin  our  investigation  of  linear  maps  from 
a  finite-dimensional  vector  space  to  itself.  Their  study  constitutes  the  most 
important  part  of  linear  algebra. 

Our  standing  assumptions  are  as  follows: 

5.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  denotes  a  vector  space  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  invariant  subspaces 

■  eigenvalues,  eigenvectors,  and  eigenspaces 

■ each operator on a finite-dimensional complex vector space has an
eigenvalue and an upper-triangular matrix with respect to some
basis

5.A Invariant Subspaces

In this chapter we develop the tools that will help us understand the structure
of operators. Recall that an operator is a linear map from a vector space to
itself. Recall also that we denote the set of operators on $V$ by $\mathcal{L}(V)$; in other
words, $\mathcal{L}(V) = \mathcal{L}(V, V)$.

Let’s see how we might better understand what an operator looks like.
Suppose $T \in \mathcal{L}(V)$. If we have a direct sum decomposition

\[ V = U_1 \oplus \dots \oplus U_m, \]

where each $U_j$ is a proper subspace of $V$, then to understand the behavior of
$T$, we need only understand the behavior of each $T|_{U_j}$; here $T|_{U_j}$ denotes
the restriction of $T$ to the smaller domain $U_j$. Dealing with $T|_{U_j}$ should be
easier than dealing with $T$ because $U_j$ is a smaller vector space than $V$.

However, if we intend to apply tools useful in the study of operators (such
as taking powers), then we have a problem: $T|_{U_j}$ may not map $U_j$ into itself;
in other words, $T|_{U_j}$ may not be an operator on $U_j$. Thus we are led to
consider only decompositions of $V$ of the form above where $T$ maps each $U_j$
into itself.

The  notion  of  a  subspace  that  gets  mapped  into  itself  is  sufficiently  impor¬ 
tant  to  deserve  a  name. 


5.2  Definition  invariant  subspace 

Suppose $T \in \mathcal{L}(V)$. A subspace $U$ of $V$ is called invariant under $T$ if
$u \in U$ implies $Tu \in U$.


In other words, $U$ is invariant under $T$ if $T|_U$ is an operator on $U$.
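
The definition can be tested mechanically in small cases. The following is a minimal sketch, assuming $V = \mathbf{Q}^n$ with $T$ given by a matrix and $U$ given by a spanning list; the helper names `rank`, `apply`, and `is_invariant` are illustrative and do not come from the text. Because $T$ is linear, it is enough to check that $T$ sends each spanning vector of $U$ back into $U$, which holds exactly when appending $Tu$ to the spanning list does not raise the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a pivot in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def apply(T, v):
    """Apply the matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def is_invariant(T, spanning):
    """True if span(spanning) is mapped into itself by T."""
    base = rank(spanning)
    return all(rank(spanning + [apply(T, u)]) == base for u in spanning)

T = [[0, 1], [0, 0]]                  # T(x, y) = (y, 0)
print(is_invariant(T, [[1, 0]]))      # U = {(x, 0)}: True
print(is_invariant(T, [[0, 1]]))      # W = {(0, y)}: False
```

The rank comparison is only one way to test membership in a span; it is chosen here because it needs nothing beyond exact rational arithmetic.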


5.3 Example Suppose $T \in \mathcal{L}(V)$. Show that each of the following
subspaces of $V$ is invariant under $T$:

(a) $\{0\}$;

(b) $V$;

(c) null $T$;

(d) range $T$.

The most famous unsolved problem
in functional analysis is called the
invariant subspace problem. It
deals with invariant subspaces of
operators on infinite-dimensional
vector spaces.

Solution

(a) If $u \in \{0\}$, then $u = 0$ and hence $Tu = 0 \in \{0\}$. Thus $\{0\}$ is invariant
under $T$.

(b) If $u \in V$, then $Tu \in V$. Thus $V$ is invariant under $T$.

(c) If $u \in$ null $T$, then $Tu = 0$, and hence $Tu \in$ null $T$. Thus null $T$ is
invariant under $T$.

(d) If $u \in$ range $T$, then $Tu \in$ range $T$. Thus range $T$ is invariant under $T$.


Must an operator $T \in \mathcal{L}(V)$ have any invariant subspaces other than $\{0\}$
and $V$? Later we will see that this question has an affirmative answer if $V$ is
finite-dimensional and $\dim V > 1$ (for $\mathbf{F} = \mathbf{C}$) or $\dim V > 2$ (for $\mathbf{F} = \mathbf{R}$);
see 5.21 and 9.8.

Although  null  T  and  range  T  are  invariant  under  T,  they  do  not  necessarily 
provide  easy  answers  to  the  question  about  the  existence  of  invariant  subspaces 
other  than  {0}  and  V,  because  null  T  may  equal  {0}  and  range  T  may  equal 
V  (this  happens  when  T  is  invertible). 

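5.4 Example Suppose that $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}))$ is defined by $Tp = p'$.
Then $\mathcal{P}_4(\mathbf{R})$, which is a subspace of $\mathcal{P}(\mathbf{R})$, is invariant under $T$ because
if $p \in \mathcal{P}(\mathbf{R})$ has degree at most 4, then $p'$ also has degree at most 4.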
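
Example 5.4 can be checked on a concrete polynomial. This is a small sketch, assuming polynomials are stored as coefficient lists `[a0, a1, a2, ...]`; the function names `derivative` and `degree` are hypothetical helpers, and the zero polynomial is assigned degree $-\infty$ as a convention.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (a_k multiplies x**k)."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def degree(coeffs):
    """Degree of the polynomial; the zero polynomial is assigned -inf here."""
    nonzero = [k for k, a in enumerate(coeffs) if a != 0]
    return max(nonzero) if nonzero else float("-inf")

p = [7, 0, -3, 0, 2]          # 7 - 3x^2 + 2x^4, an element of P_4(R)
dp = derivative(p)            # -6x + 8x^3
print(dp, degree(dp))         # [0, -6, 0, 8] 3
```

Differentiation lowers the degree of every nonconstant polynomial by one, so the degree bound 4 is preserved.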


Eigenvalues  and  Eigenvectors 

We  will  return  later  to  a  deeper  study  of  invariant  subspaces.  Now  we  turn  to  an 
investigation  of  the  simplest  possible  nontrivial  invariant  subspaces — invariant 
subspaces  with  dimension  1 . 

Take any $v \in V$ with $v \ne 0$ and let $U$ equal the set of all scalar multiples
of $v$:

\[ U = \{\lambda v : \lambda \in \mathbf{F}\} = \operatorname{span}(v). \]

Then $U$ is a 1-dimensional subspace of $V$ (and every 1-dimensional subspace
of $V$ is of this form for an appropriate choice of $v$). If $U$ is invariant under an
operator $T \in \mathcal{L}(V)$, then $Tv \in U$, and hence there is a scalar $\lambda \in \mathbf{F}$ such that

\[ Tv = \lambda v. \]

Conversely, if $Tv = \lambda v$ for some $\lambda \in \mathbf{F}$, then $\operatorname{span}(v)$ is a 1-dimensional
subspace of $V$ invariant under $T$.

The equation

\[ Tv = \lambda v, \]

which we have just seen is intimately connected with 1-dimensional invariant
subspaces, is important enough that the vectors $v$ and scalars $\lambda$ satisfying it
are given special names.

5.5  Definition  eigenvalue 

Suppose $T \in \mathcal{L}(V)$. A number $\lambda \in \mathbf{F}$ is called an eigenvalue of $T$ if
there exists $v \in V$ such that $v \ne 0$ and $Tv = \lambda v$.


The word eigenvalue is half-German,
half-English. The German adjective
eigen means “own” in the sense of
characterizing an intrinsic property.
Some mathematicians use the term
characteristic value instead of eigenvalue.


The comments above show that $T$
has a 1-dimensional invariant subspace
if and only if $T$ has an eigenvalue.

In the definition above, we require
that $v \ne 0$ because every scalar $\lambda \in \mathbf{F}$
satisfies $T0 = \lambda 0$.


5.6  Equivalent  conditions  to  be  an  eigenvalue 


Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{F}$. Then the
following are equivalent:

(a) $\lambda$ is an eigenvalue of $T$;

(b) $T - \lambda I$ is not injective;

(c) $T - \lambda I$ is not surjective;

(d) $T - \lambda I$ is not invertible.


Recall that $I \in \mathcal{L}(V)$ is the identity
operator defined by $Iv = v$ for
all $v \in V$.

Proof Conditions (a) and (b) are equivalent because the equation $Tv = \lambda v$
is equivalent to the equation $(T - \lambda I)v = 0$. Conditions (b), (c), and (d) are
equivalent by 3.69. ■


5.7  Definition  eigenvector 

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$ is an eigenvalue of $T$. A vector $v \in V$ is
called an eigenvector of $T$ corresponding to $\lambda$ if $v \ne 0$ and $Tv = \lambda v$.

Because $Tv = \lambda v$ if and only if $(T - \lambda I)v = 0$, a vector $v \in V$ with $v \ne 0$
is an eigenvector of $T$ corresponding to $\lambda$ if and only if $v \in \operatorname{null}(T - \lambda I)$.

5.8 Example Suppose $T \in \mathcal{L}(\mathbf{F}^2)$ is defined by

\[ T(w, z) = (-z, w). \]

(a) Find the eigenvalues and eigenvectors of $T$ if $\mathbf{F} = \mathbf{R}$.

(b) Find the eigenvalues and eigenvectors of $T$ if $\mathbf{F} = \mathbf{C}$.

Solution

(a) If $\mathbf{F} = \mathbf{R}$, then $T$ is a counterclockwise rotation by 90° about the
origin in $\mathbf{R}^2$. An operator has an eigenvalue if and only if there exists a
nonzero vector in its domain that gets sent by the operator to a scalar
multiple of itself. A 90° counterclockwise rotation of a nonzero vector
in $\mathbf{R}^2$ obviously never equals a scalar multiple of itself. Conclusion: if
$\mathbf{F} = \mathbf{R}$, then $T$ has no eigenvalues (and thus has no eigenvectors).

(b) To find eigenvalues of $T$, we must find the scalars $\lambda$ such that

\[ T(w, z) = \lambda(w, z) \]

has some solution other than $w = z = 0$. The equation above is
equivalent to the simultaneous equations

5.9 \[ -z = \lambda w, \quad w = \lambda z. \]

Substituting the value for $w$ given by the second equation into the first
equation gives

\[ -z = \lambda^2 z. \]

Now $z$ cannot equal 0 [otherwise 5.9 implies that $w = 0$; we are
looking for solutions to 5.9 where $(w, z)$ is not the 0 vector], so the
equation above leads to the equation

\[ -1 = \lambda^2. \]

The solutions to this equation are $\lambda = i$ and $\lambda = -i$. You should
be able to verify easily that $i$ and $-i$ are eigenvalues of $T$. Indeed,
the eigenvectors corresponding to the eigenvalue $i$ are the vectors of
the form $(w, -wi)$, with $w \in \mathbf{C}$ and $w \ne 0$, and the eigenvectors
corresponding to the eigenvalue $-i$ are the vectors of the form $(w, wi)$,
with $w \in \mathbf{C}$ and $w \ne 0$.
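
The verification suggested in part (b) can also be carried out numerically. The sketch below uses Python's built-in complex numbers; the helper name `is_eigenpair` and the sample value of `w` are illustrative choices, not part of the example.

```python
def T(w, z):
    """The operator T(w, z) = (-z, w) from Example 5.8."""
    return (-z, w)

def is_eigenpair(lam, v):
    """Check that v is nonzero and T v == lam * v for v = (w, z)."""
    w, z = v
    return v != (0, 0) and T(w, z) == (lam * w, lam * z)

w = 2 + 3j  # any nonzero complex number would do
print(is_eigenpair(1j, (w, -w * 1j)))    # eigenvalue i, eigenvector (w, -wi)
print(is_eigenpair(-1j, (w, w * 1j)))    # eigenvalue -i, eigenvector (w, wi)
```

Exact equality is safe here only because the sample entries are small integers; for general floating-point input a tolerance would be needed.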




Now  we  show  that  eigenvectors  corresponding  to  distinct  eigenvalues  are 
linearly  independent. 

5.10  Linearly  independent  eigenvectors 

Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$ and
$v_1, \dots, v_m$ are corresponding eigenvectors. Then $v_1, \dots, v_m$ is linearly
independent.

Proof Suppose $v_1, \dots, v_m$ is linearly dependent. Let $k$ be the smallest positive
integer such that

5.11 \[ v_k \in \operatorname{span}(v_1, \dots, v_{k-1}); \]

the existence of $k$ with this property follows from the Linear Dependence
Lemma (2.21). Thus there exist $a_1, \dots, a_{k-1} \in \mathbf{F}$ such that

5.12 \[ v_k = a_1 v_1 + \dots + a_{k-1} v_{k-1}. \]

Apply $T$ to both sides of this equation, getting

\[ \lambda_k v_k = a_1 \lambda_1 v_1 + \dots + a_{k-1} \lambda_{k-1} v_{k-1}. \]

Multiply both sides of 5.12 by $\lambda_k$ and then subtract the equation above, getting

\[ 0 = a_1(\lambda_k - \lambda_1) v_1 + \dots + a_{k-1}(\lambda_k - \lambda_{k-1}) v_{k-1}. \]

Because we chose $k$ to be the smallest positive integer satisfying 5.11,
$v_1, \dots, v_{k-1}$ is linearly independent. Thus the equation above implies that all
the $a$’s are 0 (recall that $\lambda_k$ is not equal to any of $\lambda_1, \dots, \lambda_{k-1}$). However, this
means that $v_k$ equals 0 (see 5.12), contradicting our hypothesis that $v_k$ is an
eigenvector. Therefore our assumption that $v_1, \dots, v_m$ is linearly dependent
was false. ■

The  corollary  below  states  that  an  operator  cannot  have  more  distinct 
eigenvalues  than  the  dimension  of  the  vector  space  on  which  it  acts. 

5.13  Number  of  eigenvalues 

Suppose  V  is  finite-dimensional.  Then  each  operator  on  V  has  at  most 
dim  V  distinct  eigenvalues. 

Proof Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$.
Let $v_1, \dots, v_m$ be corresponding eigenvectors. Then 5.10 implies that the list
$v_1, \dots, v_m$ is linearly independent. Thus $m \le \dim V$ (see 2.23), as desired. ■

Restriction and Quotient Operators

If $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$ invariant under $T$, then $U$ determines
two other operators $T|_U \in \mathcal{L}(U)$ and $T/U \in \mathcal{L}(V/U)$ in a natural way, as
defined below.


5.14 Definition $T|_U$ and $T/U$

Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$ invariant under $T$.

• The restriction operator $T|_U \in \mathcal{L}(U)$ is defined by

\[ T|_U(u) = Tu \]

for $u \in U$.

• The quotient operator $T/U \in \mathcal{L}(V/U)$ is defined by

\[ (T/U)(v + U) = Tv + U \]

for $v \in V$.


For both the operators defined above, it is worthwhile to pay attention
to their domains and to spend a moment thinking about why they are well
defined as operators on their domains. First consider the restriction operator
$T|_U \in \mathcal{L}(U)$, which is $T$ with its domain restricted to $U$, thought of as
mapping into $U$ instead of into $V$. The condition that $U$ is invariant under $T$
is what allows us to think of $T|_U$ as an operator on $U$, meaning a linear map
into the same space as the domain, rather than as simply a linear map from
one vector space to another vector space.

To show that the definition above of the quotient operator makes sense,
we need to verify that if $v + U = w + U$, then $Tv + U = Tw + U$. Hence
suppose $v + U = w + U$. Thus $v - w \in U$ (see 3.85). Because $U$ is invariant
under $T$, we also have $T(v - w) \in U$, which implies that $Tv - Tw \in U$, which
implies that $Tv + U = Tw + U$, as desired.

Suppose $T$ is an operator on a finite-dimensional vector space $V$ and $U$ is
a subspace of $V$ invariant under $T$, with $U \ne \{0\}$ and $U \ne V$. In some sense,
we can learn about $T$ by studying the operators $T|_U$ and $T/U$, each of which
is an operator on a vector space with smaller dimension than $V$. For example,
proof 2 of 5.27 makes nice use of $T/U$.

However, sometimes $T|_U$ and $T/U$ do not provide enough information
about $T$. In the next example, both $T|_U$ and $T/U$ are 0 even though $T$ is not
the 0 operator.




5.15 Example Define an operator $T \in \mathcal{L}(\mathbf{F}^2)$ by $T(x, y) = (y, 0)$. Let
$U = \{(x, 0) : x \in \mathbf{F}\}$. Show that

(a) $U$ is invariant under $T$ and $T|_U$ is the 0 operator on $U$;

(b) there does not exist a subspace $W$ of $\mathbf{F}^2$ that is invariant under $T$ and
such that $\mathbf{F}^2 = U \oplus W$;

(c) $T/U$ is the 0 operator on $\mathbf{F}^2/U$.

Solution

(a) For $(x, 0) \in U$, we have $T(x, 0) = (0, 0) \in U$. Thus $U$ is invariant
under $T$ and $T|_U$ is the 0 operator on $U$.

(b) Suppose $W$ is a subspace of $\mathbf{F}^2$ such that $\mathbf{F}^2 = U \oplus W$. Because
$\dim \mathbf{F}^2 = 2$ and $\dim U = 1$, we have $\dim W = 1$. If $W$ were invariant
under $T$, then each nonzero vector in $W$ would be an eigenvector of $T$.
However, it is easy to see that 0 is the only eigenvalue of $T$ and that all
eigenvectors of $T$ are in $U$. Thus $W$ is not invariant under $T$.

(c) For $(x, y) \in \mathbf{F}^2$, we have

\[ (T/U)\bigl((x, y) + U\bigr) = T(x, y) + U = (y, 0) + U = 0 + U, \]

where the last equality holds because $(y, 0) \in U$. The equation above
shows that $T/U$ is the 0 operator.
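
The three claims of Example 5.15 can be spot-checked on a few vectors. This is only an illustration on sample data, not a proof; the helper `in_U` and the list `samples` are hypothetical names introduced for the sketch.

```python
def T(v):
    """The operator T(x, y) = (y, 0) on F^2 from Example 5.15."""
    x, y = v
    return (y, 0)

def in_U(v):
    """Membership test for U = {(x, 0)}."""
    return v[1] == 0

samples = [(1, 0), (0, 1), (2, -5), (-3, 4)]

# T restricted to U is 0: every sampled vector of U is sent to (0, 0).
restriction_is_zero = all(T(u) == (0, 0) for u in samples if in_U(u))
# T/U is 0: Tv lies in U for every sampled v, so Tv + U = 0 + U.
quotient_is_zero = all(in_U(T(v)) for v in samples)
# Yet T itself is not the 0 operator.
T_is_zero = all(T(v) == (0, 0) for v in samples)
print(restriction_is_zero, quotient_is_zero, T_is_zero)   # True True False
```

The last value is False because $T(0, 1) = (1, 0) \ne 0$, which is exactly the information that $T|_U$ and $T/U$ fail to record.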


EXERCISES 5.A

1 Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$.

(a) Prove that if $U \subset$ null $T$, then $U$ is invariant under $T$.

(b) Prove that if range $T \subset U$, then $U$ is invariant under $T$.

2 Suppose $S, T \in \mathcal{L}(V)$ are such that $ST = TS$. Prove that null $S$ is
invariant under $T$.

3 Suppose $S, T \in \mathcal{L}(V)$ are such that $ST = TS$. Prove that range $S$ is
invariant under $T$.

4 Suppose that $T \in \mathcal{L}(V)$ and $U_1, \dots, U_m$ are subspaces of $V$ invariant
under $T$. Prove that $U_1 + \dots + U_m$ is invariant under $T$.

5 Suppose $T \in \mathcal{L}(V)$. Prove that the intersection of every collection of
subspaces of $V$ invariant under $T$ is invariant under $T$.

6 Prove or give a counterexample: if $V$ is finite-dimensional and $U$ is a
subspace of $V$ that is invariant under every operator on $V$, then $U = \{0\}$
or $U = V$.

7 Suppose $T \in \mathcal{L}(\mathbf{R}^2)$ is defined by $T(x, y) = (-3y, x)$. Find the
eigenvalues of $T$.

8 Define $T \in \mathcal{L}(\mathbf{F}^2)$ by

\[ T(w, z) = (z, w). \]

Find all eigenvalues and eigenvectors of $T$.

9 Define $T \in \mathcal{L}(\mathbf{F}^3)$ by

\[ T(z_1, z_2, z_3) = (2z_2, 0, 5z_3). \]

Find all eigenvalues and eigenvectors of $T$.

10 Define $T \in \mathcal{L}(\mathbf{F}^n)$ by

\[ T(x_1, x_2, x_3, \dots, x_n) = (x_1, 2x_2, 3x_3, \dots, n x_n). \]

(a) Find all eigenvalues and eigenvectors of $T$.

(b) Find all invariant subspaces of $T$.

11 Define $T \colon \mathcal{P}(\mathbf{R}) \to \mathcal{P}(\mathbf{R})$ by $Tp = p'$. Find all eigenvalues and
eigenvectors of $T$.

12 Define $T \in \mathcal{L}(\mathcal{P}_4(\mathbf{R}))$ by

\[ (Tp)(x) = x p'(x) \]

for all $x \in \mathbf{R}$. Find all eigenvalues and eigenvectors of $T$.

13 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{F}$. Prove that there
exists $\alpha \in \mathbf{F}$ such that $|\alpha - \lambda| < \frac{1}{1000}$ and $T - \alpha I$ is invertible.

14 Suppose $V = U \oplus W$, where $U$ and $W$ are nonzero subspaces of $V$.
Define $P \in \mathcal{L}(V)$ by $P(u + w) = u$ for $u \in U$ and $w \in W$. Find all
eigenvalues and eigenvectors of $P$.

15 Suppose $T \in \mathcal{L}(V)$. Suppose $S \in \mathcal{L}(V)$ is invertible.

(a) Prove that $T$ and $S^{-1}TS$ have the same eigenvalues.

(b) What is the relationship between the eigenvectors of $T$ and the
eigenvectors of $S^{-1}TS$?

16 Suppose $V$ is a complex vector space, $T \in \mathcal{L}(V)$, and the matrix of $T$
with respect to some basis of $V$ contains only real entries. Show that if
$\lambda$ is an eigenvalue of $T$, then so is $\bar{\lambda}$.

17 Give an example of an operator $T \in \mathcal{L}(\mathbf{R}^4)$ such that $T$ has no (real)
eigenvalues.

18 Show that the operator $T \in \mathcal{L}(\mathbf{C}^\infty)$ defined by

\[ T(z_1, z_2, \dots) = (0, z_1, z_2, \dots) \]

has no eigenvalues.

19 Suppose $n$ is a positive integer and $T \in \mathcal{L}(\mathbf{F}^n)$ is defined by

\[ T(x_1, \dots, x_n) = (x_1 + \dots + x_n, \dots, x_1 + \dots + x_n); \]

in other words, $T$ is the operator whose matrix (with respect to the
standard basis) consists of all 1’s. Find all eigenvalues and eigenvectors
of $T$.

20 Find all eigenvalues and eigenvectors of the backward shift operator
$T \in \mathcal{L}(\mathbf{F}^\infty)$ defined by

\[ T(z_1, z_2, z_3, \dots) = (z_2, z_3, \dots). \]

21 Suppose $T \in \mathcal{L}(V)$ is invertible.

(a) Suppose $\lambda \in \mathbf{F}$ with $\lambda \ne 0$. Prove that $\lambda$ is an eigenvalue of $T$ if
and only if $\frac{1}{\lambda}$ is an eigenvalue of $T^{-1}$.

(b) Prove that $T$ and $T^{-1}$ have the same eigenvectors.

22 Suppose $T \in \mathcal{L}(V)$ and there exist nonzero vectors $v$ and $w$ in $V$ such
that

\[ Tv = 3w \quad\text{and}\quad Tw = 3v. \]

Prove that 3 or $-3$ is an eigenvalue of $T$.

23 Suppose $V$ is finite-dimensional and $S, T \in \mathcal{L}(V)$. Prove that $ST$ and
$TS$ have the same eigenvalues.

24 Suppose $A$ is an $n$-by-$n$ matrix with entries in $\mathbf{F}$. Define $T \in \mathcal{L}(\mathbf{F}^n)$
by $Tx = Ax$, where elements of $\mathbf{F}^n$ are thought of as $n$-by-1 column
vectors.

(a) Suppose the sum of the entries in each row of $A$ equals 1. Prove
that 1 is an eigenvalue of $T$.

(b) Suppose the sum of the entries in each column of $A$ equals 1.
Prove that 1 is an eigenvalue of $T$.

25 Suppose $T \in \mathcal{L}(V)$ and $u, v$ are eigenvectors of $T$ such that $u + v$
is also an eigenvector of $T$. Prove that $u$ and $v$ are eigenvectors of $T$
corresponding to the same eigenvalue.

26 Suppose $T \in \mathcal{L}(V)$ is such that every nonzero vector in $V$ is an eigenvector
of $T$. Prove that $T$ is a scalar multiple of the identity operator.

27 Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$ is such that every subspace
of $V$ with dimension $\dim V - 1$ is invariant under $T$. Prove that $T$
is a scalar multiple of the identity operator.

28 Suppose $V$ is finite-dimensional with $\dim V \ge 3$ and $T \in \mathcal{L}(V)$ is such
that every 2-dimensional subspace of $V$ is invariant under $T$. Prove that
$T$ is a scalar multiple of the identity operator.

29 Suppose $T \in \mathcal{L}(V)$ and $\dim \operatorname{range} T = k$. Prove that $T$ has at most
$k + 1$ distinct eigenvalues.

30 Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ and $-4$, 5, and $\sqrt{7}$ are eigenvalues of $T$. Prove that
there exists $x \in \mathbf{R}^3$ such that $Tx - 9x = (-4, 5, \sqrt{7})$.

31 Suppose $V$ is finite-dimensional and $v_1, \dots, v_m$ is a list of vectors in $V$.
Prove that $v_1, \dots, v_m$ is linearly independent if and only if there exists
$T \in \mathcal{L}(V)$ such that $v_1, \dots, v_m$ are eigenvectors of $T$ corresponding to
distinct eigenvalues.

32 Suppose $\lambda_1, \dots, \lambda_n$ is a list of distinct real numbers. Prove that the list
$e^{\lambda_1 x}, \dots, e^{\lambda_n x}$ is linearly independent in the vector space of real-valued
functions on $\mathbf{R}$.

Hint: Let $V = \operatorname{span}(e^{\lambda_1 x}, \dots, e^{\lambda_n x})$, and define an operator $T \in \mathcal{L}(V)$
by $Tf = f'$. Find eigenvalues and eigenvectors of $T$.

33 Suppose $T \in \mathcal{L}(V)$. Prove that $T/(\operatorname{range} T) = 0$.

34 Suppose $T \in \mathcal{L}(V)$. Prove that $T/(\operatorname{null} T)$ is injective if and only if
$(\operatorname{null} T) \cap (\operatorname{range} T) = \{0\}$.

35 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $U$ is invariant under $T$.
Prove that each eigenvalue of $T/U$ is an eigenvalue of $T$.

[The exercise below asks you to verify that the hypothesis that $V$ is
finite-dimensional is needed for the exercise above.]

36 Give an example of a vector space $V$, an operator $T \in \mathcal{L}(V)$, and
a subspace $U$ of $V$ that is invariant under $T$ such that $T/U$ has an
eigenvalue that is not an eigenvalue of $T$.
5.B Eigenvectors and Upper-Triangular Matrices

Polynomials Applied to Operators

The main reason that a richer theory exists for operators (which map a vector
space into itself) than for more general linear maps is that operators can be
raised to powers. We begin this section by defining that notion and the key
concept of applying a polynomial to an operator.

If $T \in \mathcal{L}(V)$, then $TT$ makes sense and is also in $\mathcal{L}(V)$. We usually write
$T^2$ instead of $TT$. More generally, we have the following definition.


5.16 Definition $T^m$

Suppose $T \in \mathcal{L}(V)$ and $m$ is a positive integer.

• $T^m$ is defined by

\[ T^m = \underbrace{T \cdots T}_{m \text{ times}}. \]

• $T^0$ is defined to be the identity operator $I$ on $V$.

• If $T$ is invertible with inverse $T^{-1}$, then $T^{-m}$ is defined by

\[ T^{-m} = (T^{-1})^m. \]


You should verify that if $T$ is an operator, then

\[ T^m T^n = T^{m+n} \quad\text{and}\quad (T^m)^n = T^{mn}, \]

where $m$ and $n$ are allowed to be arbitrary integers if $T$ is invertible and
nonnegative integers if $T$ is not invertible.


5.17 Definition $p(T)$

Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(\mathbf{F})$ is a polynomial given by

\[ p(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_m z^m \]

for $z \in \mathbf{F}$. Then $p(T)$ is the operator defined by

\[ p(T) = a_0 I + a_1 T + a_2 T^2 + \dots + a_m T^m. \]


This is a new use of the symbol $p$ because we are applying it to operators,
not just elements of $\mathbf{F}$.

5.18 Example Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}))$ is the differentiation operator
defined by $Dq = q'$ and $p$ is the polynomial defined by $p(x) = 7 - 3x + 5x^2$.
Then $p(D) = 7I - 3D + 5D^2$; thus

\[ (p(D))q = 7q - 3q' + 5q'' \]

for every $q \in \mathcal{P}(\mathbf{R})$.

If we fix an operator $T \in \mathcal{L}(V)$, then the function from $\mathcal{P}(\mathbf{F})$ to $\mathcal{L}(V)$
given by $p \mapsto p(T)$ is linear, as you should verify.
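
Definition 5.17 and Example 5.18 translate directly into a short computation. The sketch below assumes polynomials are stored as coefficient lists `[q0, q1, ...]`; the helpers `derivative`, `add`, `scale`, and `apply_poly` are illustrative names, and the sample polynomial $q(x) = 1 + 2x^3$ is an arbitrary choice.

```python
def derivative(q):
    """Differentiate q given as coefficients [q0, q1, ...] of 1, x, x^2, ..."""
    return [k * a for k, a in enumerate(q)][1:] or [0]

def add(p, q):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, q):
    """Multiply every coefficient of q by the scalar c."""
    return [c * a for a in q]

def apply_poly(coeffs, op, q):
    """Compute (a0*I + a1*op + a2*op^2 + ...) q for coeffs = [a0, a1, a2, ...]."""
    result, power = [0], q            # power holds op^k applied to q
    for a in coeffs:
        result = add(result, scale(a, power))
        power = op(power)
    return result

q = [1, 0, 0, 2]                                  # q(x) = 1 + 2x^3
lhs = apply_poly([7, -3, 5], derivative, q)       # p(D) q with p(x) = 7 - 3x + 5x^2
q1 = derivative(q)
q2 = derivative(q1)
rhs = add(add(scale(7, q), scale(-3, q1)), scale(5, q2))   # 7q - 3q' + 5q''
print(lhs, lhs == rhs)                            # [7, 60, -18, 14] True
```

Here `apply_poly` never forms the operator $p(D)$ as a matrix; it only applies successive powers of the operator to one vector, which is all the definition requires.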

5.19  Definition  product  of  polynomials 

If $p, q \in \mathcal{P}(\mathbf{F})$, then $pq \in \mathcal{P}(\mathbf{F})$ is the polynomial defined by

\[ (pq)(z) = p(z)q(z) \]

for $z \in \mathbf{F}$.

Any  two  polynomials  of  an  operator  commute,  as  shown  below. 

5.20  Multiplicative  properties 

Suppose $p, q \in \mathcal{P}(\mathbf{F})$ and $T \in \mathcal{L}(V)$. Then

(a) $(pq)(T) = p(T)q(T)$;

(b) $p(T)q(T) = q(T)p(T)$.

Proof

(a) Suppose $p(z) = \sum_{j=0}^{m} a_j z^j$ and $q(z) = \sum_{k=0}^{n} b_k z^k$ for $z \in \mathbf{F}$.
Then

\[ (pq)(z) = \sum_{j=0}^{m} \sum_{k=0}^{n} a_j b_k z^{j+k}. \]

Thus

\[ (pq)(T) = \sum_{j=0}^{m} \sum_{k=0}^{n} a_j b_k T^{j+k}
= \Bigl(\sum_{j=0}^{m} a_j T^j\Bigr)\Bigl(\sum_{k=0}^{n} b_k T^k\Bigr)
= p(T)q(T). \]

(b) Part (a) implies $p(T)q(T) = (pq)(T) = (qp)(T) = q(T)p(T)$. ■


Part (a) holds because when expanding
a product of polynomials
using the distributive property, it
does not matter whether the symbol
is $z$ or $T$.
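
Both parts of 5.20 can be checked on a small matrix. The sketch below assumes $T$ is a 2-by-2 integer matrix stored as a list of rows; the helpers `matmul`, `poly_of`, and `poly_mul` and the sample data are illustrative, not taken from the text.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of(coeffs, T):
    """Evaluate a0*I + a1*T + ... + am*T^m for coeffs = [a0, ..., am]."""
    n = len(T)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # T^0 = I
    total = [[0] * n for _ in range(n)]
    for a in coeffs:
        total = [[total[i][j] + a * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, T)
    return total

def poly_mul(p, q):
    """Coefficients of the product polynomial pq."""
    out = [0] * (len(p) + len(q) - 1)
    for j, a in enumerate(p):
        for k, b in enumerate(q):
            out[j + k] += a * b
    return out

T = [[2, 1], [0, 3]]
p, q = [1, -2, 1], [4, 0, 5]          # p(z) = 1 - 2z + z^2, q(z) = 4 + 5z^2
pT, qT = poly_of(p, T), poly_of(q, T)
print(poly_of(poly_mul(p, q), T) == matmul(pT, qT))   # part (a): True
print(matmul(pT, qT) == matmul(qT, pT))               # part (b): True
```

Integer entries keep the arithmetic exact, so the comparisons use plain equality.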
Existence of Eigenvalues

Now  we  come  to  one  of  the  central  results  about  operators  on  complex  vector 
spaces. 

5.21  Operators  on  complex  vector  spaces  have  an  eigenvalue 

Every  operator  on  a  finite-dimensional,  nonzero,  complex  vector  space 
has  an  eigenvalue. 

Proof Suppose $V$ is a complex vector space with dimension $n > 0$ and
$T \in \mathcal{L}(V)$. Choose $v \in V$ with $v \ne 0$. Then

\[ v, Tv, T^2 v, \dots, T^n v \]

is not linearly independent, because $V$ has dimension $n$ and we have $n + 1$
vectors. Thus there exist complex numbers $a_0, \dots, a_n$, not all 0, such that

\[ 0 = a_0 v + a_1 Tv + \dots + a_n T^n v. \]

Note that $a_1, \dots, a_n$ cannot all be 0, because otherwise the equation above
would become $0 = a_0 v$, which would force $a_0$ also to be 0.

Make the $a$’s the coefficients of a polynomial, which by the Fundamental
Theorem of Algebra (4.14) has a factorization

\[ a_0 + a_1 z + \dots + a_n z^n = c(z - \lambda_1) \cdots (z - \lambda_m), \]

where $c$ is a nonzero complex number, each $\lambda_j$ is in $\mathbf{C}$, and the equation holds
for all $z \in \mathbf{C}$ (here $m$ is not necessarily equal to $n$, because $a_n$ may equal 0).
We then have

\[ \begin{aligned}
0 &= a_0 v + a_1 Tv + \dots + a_n T^n v \\
  &= (a_0 I + a_1 T + \dots + a_n T^n) v \\
  &= c(T - \lambda_1 I) \cdots (T - \lambda_m I) v.
\end{aligned} \]

Thus $T - \lambda_j I$ is not injective for at least one $j$. In other words, $T$ has an
eigenvalue. ■
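
The steps of the proof can be traced on a concrete operator. The sketch below assumes the operator $T(w, z) = (-z, w)$ on $\mathbf{C}^2$ and the starting vector $v = (1, 0)$; the dependence relation is written down by hand and the quadratic is factored with the quadratic formula, so this illustrates the argument rather than giving a general algorithm. The helper name `shift` is hypothetical.

```python
import cmath

def T(v):
    """T(w, z) = (-z, w) on C^2."""
    w, z = v
    return (-z, w)

def shift(lam, v):
    """Apply T - lam*I to v."""
    Tv = T(v)
    return (Tv[0] - lam * v[0], Tv[1] - lam * v[1])

v = (1, 0)
# v, Tv, T^2 v = (1, 0), (0, 1), (-1, 0) satisfy 0 = 1*v + 0*Tv + 1*T^2 v,
# so the polynomial in the proof is 1 + z^2.
a0, a1, a2 = 1, 0, 1
Tv = T(v)
TTv = T(Tv)
assert tuple(a0 * x + a1 * y + a2 * s for x, y, s in zip(v, Tv, TTv)) == (0, 0)

# Factor a0 + a1 z + a2 z^2 = a2 (z - lam1)(z - lam2) by the quadratic formula.
disc = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
lam1, lam2 = (-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2)

u = shift(lam2, v)                      # u = (T - lam2 I) v
killed = all(abs(c) < 1e-12 for c in shift(lam1, u))
print(u != (0, 0), killed)              # True True
```

Since $u \ne 0$ and $(T - \lambda_1 I)u = 0$, the factor $T - \lambda_1 I$ is the one that fails to be injective, and $u$ is an eigenvector of $T$ corresponding to $\lambda_1$.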

The  proof  above  depends  on  the  Fundamental  Theorem  of  Algebra,  which 
is  typical  of  proofs  of  this  result.  See  Exercises  16  and  17  for  possible  ways  to 
rewrite  the  proof  above  using  the  idea  of  the  proof  in  a  slightly  different  form. 




Upper-Triangular Matrices

In  Chapter  3  we  discussed  the  matrix  of  a  linear  map  from  one  vector  space 
to  another  vector  space.  That  matrix  depended  on  a  choice  of  a  basis  of  each 
of  the  two  vector  spaces.  Now  that  we  are  studying  operators,  which  map  a 
vector  space  to  itself,  the  emphasis  is  on  using  only  one  basis. 


5.22 Definition matrix of an operator; $\mathcal{M}(T)$

Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$. The matrix of $T$ with
respect to this basis is the $n$-by-$n$ matrix

\[ \mathcal{M}(T) = \begin{pmatrix} A_{1,1} & \dots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \dots & A_{n,n} \end{pmatrix} \]

whose entries $A_{j,k}$ are defined by

\[ Tv_k = A_{1,k} v_1 + \dots + A_{n,k} v_n. \]

If the basis is not clear from the context, then the notation
$\mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)$ is used.


Note  that  the  matrices  of  operators  are  square  arrays,  rather  than  the  more 
general  rectangular  arrays  that  we  considered  earlier  for  linear  maps. 

If $T$ is an operator on $\mathbf{F}^n$ and no
basis is specified, assume that the basis
in question is the standard one (where
the $j$th basis vector is 1 in the $j$th slot
and 0 in all the other slots). You can
then think of the $j$th column of $\mathcal{M}(T)$ as $T$ applied to the $j$th basis vector.


The $k$th column of the matrix
$\mathcal{M}(T)$ is formed from the coefficients
used to write $Tv_k$ as a linear
combination of $v_1, \dots, v_n$.


5.23 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$.
Then

\[ \mathcal{M}(T) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 5 & 3 \\ 0 & 0 & 8 \end{pmatrix}. \]
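
The matrix in Example 5.23 can be recovered by applying $T$ to each standard basis vector and using the results as columns. The following is a small sketch of that recipe; `matrix_of` is an illustrative helper name, and the matrix is stored as a list of rows.

```python
def T(v):
    """T(x, y, z) = (2x + y, 5y + 3z, 8z) from Example 5.23."""
    x, y, z = v
    return (2 * x + y, 5 * y + 3 * z, 8 * z)

def matrix_of(op, n):
    """Matrix of op on F^n with respect to the standard basis, as a list of rows."""
    basis = [tuple(int(i == k) for i in range(n)) for k in range(n)]
    columns = [op(e) for e in basis]            # k-th column is op(e_k)
    return [[columns[k][j] for k in range(n)] for j in range(n)]

M = matrix_of(T, 3)
for row in M:
    print(row)
# [2, 1, 0]
# [0, 5, 3]
# [0, 0, 8]
```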


A central goal of linear algebra is to show that given an operator $T \in \mathcal{L}(V)$,
there exists a basis of $V$ with respect to which $T$ has a reasonably simple
matrix. To make this vague formulation a bit more precise, we might try to
choose a basis of $V$ such that $\mathcal{M}(T)$ has many 0’s.

If $V$ is a finite-dimensional complex vector space, then we already know
enough to show that there is a basis of $V$ with respect to which the matrix of
$T$ has 0’s everywhere in the first column, except possibly the first entry. In
other words, there is a basis of $V$ with respect to which the matrix of $T$ looks
like

\[ \begin{pmatrix} \lambda & \\ 0 & \\ \vdots & \quad * \quad \\ 0 & \end{pmatrix}; \]

here the $*$ denotes the entries in all the columns other than the first column.
To prove this, let $\lambda$ be an eigenvalue of $T$ (one exists by 5.21) and let $v$ be a
corresponding eigenvector. Extend $v$ to a basis of $V$. Then the matrix of $T$
with respect to this basis has the form above.

Soon  we  will  see  that  we  can  choose  a  basis  of  V  with  respect  to  which 
the  matrix  of  T  has  even  more  0’s. 

5.24  Definition  diagonal  of  a  matrix 

The  diagonal  of  a  square  matrix  consists  of  the  entries  along  the  line  from 
the  upper  left  corner  to  the  bottom  right  corner. 

For  example,  the  diagonal  of  the  matrix  in  5.23  consists  of  the  entries 
2,5,8. 

5.25  Definition  upper-triangular  matrix 

A  matrix  is  called  upper  triangular  if  all  the  entries  below  the  diagonal 
equal  0. 

For  example,  the  matrix  in  5.23  is  upper  triangular. 

Typically we represent an upper-triangular matrix in the form

\[ \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}; \]

the 0 in the matrix above indicates
that all entries below the diagonal in
this $n$-by-$n$ matrix equal 0. Upper-triangular
matrices can be considered
reasonably simple: for $n$ large, almost
half the entries in an $n$-by-$n$
upper-triangular matrix are 0.


We often use $*$ to denote matrix entries
that we do not know about or
that are irrelevant to the questions
being discussed.
The following proposition demonstrates a useful connection between
upper-triangular matrices and invariant subspaces.

5.26 Conditions for upper-triangular matrix

Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$. Then the following are
equivalent:

(a) the matrix of $T$ with respect to $v_1, \dots, v_n$ is upper triangular;

(b) $Tv_j \in \operatorname{span}(v_1, \dots, v_j)$ for each $j = 1, \dots, n$;

(c) $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$ for each $j = 1, \dots, n$.

Proof The equivalence of (a) and (b) follows easily from the definitions and
a moment’s thought. Obviously (c) implies (b). Hence to complete the proof,
we need only prove that (b) implies (c).

Thus suppose (b) holds. Fix $j \in \{1, \dots, n\}$. From (b), we know that

\[ \begin{aligned}
Tv_1 &\in \operatorname{span}(v_1) \subset \operatorname{span}(v_1, \dots, v_j); \\
Tv_2 &\in \operatorname{span}(v_1, v_2) \subset \operatorname{span}(v_1, \dots, v_j); \\
&\;\;\vdots \\
Tv_j &\in \operatorname{span}(v_1, \dots, v_j).
\end{aligned} \]

Thus if $v$ is a linear combination of $v_1, \dots, v_j$, then

\[ Tv \in \operatorname{span}(v_1, \dots, v_j). \]

In other words, $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$, completing the proof. ■
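
The equivalence of (a) and (b) in 5.26 is easy to see in coordinates. The sketch below assumes the basis is the standard basis of $\mathbf{F}^n$, so that column $j$ of the matrix holds the coordinates of $Te_j$; the function names are illustrative. The two tests scan the same entries in a different order, which is the "moment’s thought" behind the equivalence.

```python
def is_upper_triangular(M):
    """Condition (a): every entry below the diagonal is 0."""
    return all(M[j][k] == 0 for j in range(len(M)) for k in range(j))

def satisfies_b(M):
    """Condition (b) for the standard basis: column j of M (the coordinates of
    T e_j) has zeros in the rows after row j, so T e_j lies in span(e_1, ..., e_j)."""
    n = len(M)
    return all(M[i][j] == 0 for j in range(n) for i in range(j + 1, n))

upper = [[2, 1, 0], [0, 5, 3], [0, 0, 8]]      # the matrix from Example 5.23
other = [[0, -1], [1, 0]]                      # matrix of T(w, z) = (-z, w)
print(is_upper_triangular(upper), satisfies_b(upper))   # True True
print(is_upper_triangular(other), satisfies_b(other))   # False False
```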

Now we can prove that for each
operator on a finite-dimensional complex
vector space, there is a basis of the
vector space with respect to which the
matrix of the operator has only 0’s below
the diagonal. In Chapter 8 we will
improve even this result.

Sometimes more insight comes from
seeing more than one proof of a theorem.
Thus two proofs are presented of
the next result. Use whichever appeals
more to you.


The next result does not hold on
real vector spaces, because the first
vector in a basis with respect to
which an operator has an upper-triangular
matrix is an eigenvector
of the operator. Thus if an operator
on a real vector space has no
eigenvalues [see 5.8(a) for an example],
then there is no basis with
respect to which the operator has
an upper-triangular matrix.
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5.27  Over  C,  every  operator  has  an  upper-triangular  matrix 

Suppose  V  is  a  finite-dimensional  complex  vector  space  and  T  G  C{V). 
Then  T  has  an  upper-triangular  matrix  with  respect  to  some  basis  of  V. 

Proof 1 We will use induction on the dimension of $V$. Clearly the desired result holds if $\dim V = 1$.

Suppose now that $\dim V > 1$ and the desired result holds for all complex vector spaces whose dimension is less than the dimension of $V$. Let $\lambda$ be any eigenvalue of $T$ (5.21 guarantees that $T$ has an eigenvalue). Let

\[ U = \operatorname{range}(T - \lambda I). \]

Because $T - \lambda I$ is not surjective (see 3.69), $\dim U < \dim V$. Furthermore, $U$ is invariant under $T$. To prove this, suppose $u \in U$. Then

\[ Tu = (T - \lambda I)u + \lambda u. \]

Obviously $(T - \lambda I)u \in U$ (because $U$ equals the range of $T - \lambda I$) and $\lambda u \in U$. Thus the equation above shows that $Tu \in U$. Hence $U$ is invariant under $T$, as claimed.

Thus $T|_U$ is an operator on $U$. By our induction hypothesis, there is a basis $u_1, \dots, u_m$ of $U$ with respect to which $T|_U$ has an upper-triangular matrix. Thus for each $j$ we have (using 5.26)

\[ Tu_j = (T|_U)(u_j) \in \operatorname{span}(u_1, \dots, u_j). \tag{5.28} \]

Extend $u_1, \dots, u_m$ to a basis $u_1, \dots, u_m, v_1, \dots, v_n$ of $V$. For each $k$, we have

\[ Tv_k = (T - \lambda I)v_k + \lambda v_k. \]

The definition of $U$ shows that $(T - \lambda I)v_k \in U = \operatorname{span}(u_1, \dots, u_m)$. Thus the equation above shows that

\[ Tv_k \in \operatorname{span}(u_1, \dots, u_m, v_1, \dots, v_k). \tag{5.29} \]

From 5.28 and 5.29, we conclude (using 5.26) that $T$ has an upper-triangular matrix with respect to the basis $u_1, \dots, u_m, v_1, \dots, v_n$ of $V$, as desired. ■

Proof 2 We will use induction on the dimension of $V$. Clearly the desired result holds if $\dim V = 1$.

Suppose now that $\dim V = n > 1$ and the desired result holds for all complex vector spaces whose dimension is $n - 1$. Let $v_1$ be any eigenvector of $T$ (5.21 guarantees that $T$ has an eigenvector). Let $U = \operatorname{span}(v_1)$. Then $U$ is an invariant subspace of $T$ and $\dim U = 1$.

Because $\dim V/U = n - 1$ (see 3.89), we can apply our induction hypothesis to $T/U \in \mathcal{L}(V/U)$. Thus there is a basis $v_2 + U, \dots, v_n + U$ of $V/U$ such that $T/U$ has an upper-triangular matrix with respect to this basis. Hence by 5.26,

\[ (T/U)(v_j + U) \in \operatorname{span}(v_2 + U, \dots, v_j + U) \]

for each $j = 2, \dots, n$. Unraveling the meaning of the inclusion above, we see that

\[ Tv_j \in \operatorname{span}(v_1, \dots, v_j) \]

for each $j = 1, \dots, n$. Thus by 5.26, $T$ has an upper-triangular matrix with respect to the basis $v_1, \dots, v_n$ of $V$, as desired (it is easy to verify that $v_1, \dots, v_n$ is a basis of $V$; see Exercise 13 in Section 3.E for a more general result). ■

How  does  one  determine  from  looking  at  the  matrix  of  an  operator  whether 
the  operator  is  invertible?  If  we  are  fortunate  enough  to  have  a  basis  with 
respect  to  which  the  matrix  of  the  operator  is  upper  triangular,  then  this 
problem  becomes  easy,  as  the  following  proposition  shows. 

5.30 Determination of invertibility from upper-triangular matrix

Suppose $T \in \mathcal{L}(V)$ has an upper-triangular matrix with respect to some basis of $V$. Then $T$ is invertible if and only if all the entries on the diagonal of that upper-triangular matrix are nonzero.

Proof Suppose $v_1, \dots, v_n$ is a basis of $V$ with respect to which $T$ has an upper-triangular matrix

\[
\mathcal{M}(T) =
\begin{pmatrix}
\lambda_1 & & & * \\
 & \lambda_2 & & \\
 & & \ddots & \\
0 & & & \lambda_n
\end{pmatrix}.
\tag{5.31}
\]

We need to prove that $T$ is invertible if and only if all the $\lambda_j$'s are nonzero.

First suppose the diagonal entries $\lambda_1, \dots, \lambda_n$ are all nonzero. The upper-triangular matrix in 5.31 implies that $Tv_1 = \lambda_1 v_1$. Because $\lambda_1 \neq 0$, we have $T(v_1/\lambda_1) = v_1$; thus $v_1 \in \operatorname{range} T$.

Now

\[ T(v_2/\lambda_2) = a v_1 + v_2 \]

for some $a \in \mathbf{F}$. The left side of the equation above and $a v_1$ are both in $\operatorname{range} T$; thus $v_2 \in \operatorname{range} T$.

Similarly, we see that

\[ T(v_3/\lambda_3) = b v_1 + c v_2 + v_3 \]

for some $b, c \in \mathbf{F}$. The left side of the equation above and $b v_1$, $c v_2$ are all in $\operatorname{range} T$; thus $v_3 \in \operatorname{range} T$.

Continuing in this fashion, we conclude that $v_1, \dots, v_n \in \operatorname{range} T$. Because $v_1, \dots, v_n$ is a basis of $V$, this implies that $\operatorname{range} T = V$. In other words, $T$ is surjective. Hence $T$ is invertible (by 3.69), as desired.

To prove the other direction, now suppose that $T$ is invertible. This implies that $\lambda_1 \neq 0$, because otherwise we would have $Tv_1 = 0$.

Let $1 < j \le n$, and suppose $\lambda_j = 0$. Then 5.31 implies that $T$ maps $\operatorname{span}(v_1, \dots, v_j)$ into $\operatorname{span}(v_1, \dots, v_{j-1})$. Because

\[ \dim \operatorname{span}(v_1, \dots, v_j) = j \quad \text{and} \quad \dim \operatorname{span}(v_1, \dots, v_{j-1}) = j - 1, \]

this implies that $T$ restricted to $\operatorname{span}(v_1, \dots, v_j)$ is not injective (by 3.23). Thus there exists $v \in \operatorname{span}(v_1, \dots, v_j)$ such that $v \neq 0$ and $Tv = 0$. Thus $T$ is not injective, which contradicts our hypothesis (for this direction) that $T$ is invertible. This contradiction means that our assumption that $\lambda_j = 0$ must be false. Hence $\lambda_j \neq 0$, as desired. ■

As  an  example  of  the  result  above,  we  see  that  the  operator  in  Example  5.23 
is  invertible. 
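The first half of the proof of 5.30 shows, one basis vector at a time, that every vector is in the range of $T$ when the diagonal entries are nonzero. In coordinates this is back-substitution. The Python sketch below is an illustration only: it assumes the operator is given by an upper-triangular matrix (a list of rows), uses exact rational arithmetic, and the function name is hypothetical.

```python
from fractions import Fraction


def solve_upper_triangular(matrix, rhs):
    """Solve matrix * x = rhs by back-substitution.

    Assumes matrix is upper triangular. A zero diagonal entry means, by 5.30,
    that the operator is not invertible, so a ValueError is raised.
    """
    n = len(matrix)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        if matrix[i][i] == 0:
            raise ValueError("zero diagonal entry: the operator is not invertible")
        # Contribution of the coordinates already determined (rows below row i).
        known = sum(Fraction(matrix[i][k]) * x[k] for k in range(i + 1, n))
        x[i] = (Fraction(rhs[i]) - known) / Fraction(matrix[i][i])
    return x


M = [[2, 1, 0], [0, 5, 3], [0, 0, 8]]
x = solve_upper_triangular(M, [0, 0, 1])
print(x)  # the vector that this matrix sends to (0, 0, 1)
```

Solving for each standard basis vector in turn would produce the columns of the inverse matrix.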

Unfortunately no method exists for exactly computing the eigenvalues of an operator from its matrix. However, if we are fortunate enough to find a basis with respect to which the matrix of the operator is upper triangular, then the problem of computing the eigenvalues becomes trivial, as the following proposition shows.

Powerful numeric techniques exist for finding good approximations to the eigenvalues of an operator from its matrix.

5.32 Determination of eigenvalues from upper-triangular matrix

Suppose $T \in \mathcal{L}(V)$ has an upper-triangular matrix with respect to some basis of $V$. Then the eigenvalues of $T$ are precisely the entries on the diagonal of that upper-triangular matrix.

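Proof Suppose $v_1, \dots, v_n$ is a basis of $V$ with respect to which $T$ has an upper-triangular matrix

\[
\mathcal{M}(T) =
\begin{pmatrix}
\lambda_1 & & * \\
 & \ddots & \\
0 & & \lambda_n
\end{pmatrix}.
\]

Let $\lambda \in \mathbf{F}$. Then

\[
\mathcal{M}(T - \lambda I) =
\begin{pmatrix}
\lambda_1 - \lambda & & * \\
 & \ddots & \\
0 & & \lambda_n - \lambda
\end{pmatrix}.
\]

Hence $T - \lambda I$ is not invertible if and only if $\lambda$ equals one of the numbers $\lambda_1, \dots, \lambda_n$ (by 5.30). Thus $\lambda$ is an eigenvalue of $T$ if and only if $\lambda$ equals one of the numbers $\lambda_1, \dots, \lambda_n$. ■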


5.33 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$. What are the eigenvalues of $T$?

Solution The matrix of $T$ with respect to the standard basis is

\[
\mathcal{M}(T) =
\begin{pmatrix}
2 & 1 & 0 \\
0 & 5 & 3 \\
0 & 0 & 8
\end{pmatrix}.
\]

Thus $\mathcal{M}(T)$ is an upper-triangular matrix. Now 5.32 implies that the eigenvalues of $T$ are 2, 5, and 8.


Once the eigenvalues of an operator on $\mathbf{F}^n$ are known, the eigenvectors can be found easily using Gaussian elimination.
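As a hedged illustration of the remark about Gaussian elimination, the Python sketch below row-reduces $\mathcal{M}(T) - \lambda I$ with exact rational arithmetic and reads off one nonzero vector in its null space, which is an eigenvector for $\lambda$. The helper names `shift` and `null_vector` are hypothetical, and the matrix used is the one from Example 5.33.

```python
from fractions import Fraction


def shift(matrix, lam):
    """Return the matrix of T - lam*I, given the matrix of T (list of rows)."""
    n = len(matrix)
    return [[matrix[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


def null_vector(matrix):
    """Return one nonzero solution x of matrix * x = 0, or None if only x = 0 works."""
    rows = [[Fraction(a) for a in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [a / pivot for a in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == n_rows:
            break
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if not free_cols:
        return None
    free = free_cols[0]
    x = [Fraction(0)] * n_cols
    x[free] = Fraction(1)  # set one free variable to 1, the others to 0
    for row_index, c in enumerate(pivot_cols):
        x[c] = -rows[row_index][free]
    return x


M = [[2, 1, 0], [0, 5, 3], [0, 0, 8]]
for lam in (2, 5, 8):
    print(lam, null_vector(shift(M, lam)))
```

Any nonzero scalar multiple of a returned vector is an equally valid eigenvector; for a number that is not an eigenvalue, such as 3, the function returns `None`.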
EXERCISES 5.B

1 Suppose $T \in \mathcal{L}(V)$ and there exists a positive integer $n$ such that $T^n = 0$.

(a) Prove that $I - T$ is invertible and that

\[ (I - T)^{-1} = I + T + \dots + T^{n-1}. \]

(b) Explain how you would guess the formula above.

2 Suppose $T \in \mathcal{L}(V)$ and $(T - 2I)(T - 3I)(T - 4I) = 0$. Suppose $\lambda$ is an eigenvalue of $T$. Prove that $\lambda = 2$ or $\lambda = 3$ or $\lambda = 4$.

3 Suppose $T \in \mathcal{L}(V)$ and $T^2 = I$ and $-1$ is not an eigenvalue of $T$. Prove that $T = I$.

4 Suppose $P \in \mathcal{L}(V)$ and $P^2 = P$. Prove that $V = \operatorname{null} P \oplus \operatorname{range} P$.

5 Suppose $S, T \in \mathcal{L}(V)$ and $S$ is invertible. Suppose $p \in \mathcal{P}(\mathbf{F})$ is a polynomial. Prove that

\[ p(STS^{-1}) = S\,p(T)\,S^{-1}. \]

6 Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$ invariant under $T$. Prove that $U$ is invariant under $p(T)$ for every polynomial $p \in \mathcal{P}(\mathbf{F})$.

7 Suppose $T \in \mathcal{L}(V)$. Prove that 9 is an eigenvalue of $T^2$ if and only if 3 or $-3$ is an eigenvalue of $T$.

8 Give an example of $T \in \mathcal{L}(\mathbf{R}^2)$ such that $T^4 = -I$.

9 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $v \in V$ with $v \neq 0$. Let $p$ be a nonzero polynomial of smallest degree such that $p(T)v = 0$. Prove that every zero of $p$ is an eigenvalue of $T$.

10 Suppose $T \in \mathcal{L}(V)$ and $v$ is an eigenvector of $T$ with eigenvalue $\lambda$. Suppose $p \in \mathcal{P}(\mathbf{F})$. Prove that $p(T)v = p(\lambda)v$.

11 Suppose $\mathbf{F} = \mathbf{C}$, $T \in \mathcal{L}(V)$, $p \in \mathcal{P}(\mathbf{C})$ is a polynomial, and $\alpha \in \mathbf{C}$. Prove that $\alpha$ is an eigenvalue of $p(T)$ if and only if $\alpha = p(\lambda)$ for some eigenvalue $\lambda$ of $T$.

12 Show that the result in the previous exercise does not hold if $\mathbf{C}$ is replaced with $\mathbf{R}$.

13 Suppose $W$ is a complex vector space and $T \in \mathcal{L}(W)$ has no eigenvalues. Prove that every subspace of $W$ invariant under $T$ is either $\{0\}$ or infinite-dimensional.

14 Give an example of an operator whose matrix with respect to some basis contains only 0's on the diagonal, but the operator is invertible.

[The exercise above and the exercise below show that 5.30 fails without the hypothesis that an upper-triangular matrix is under consideration.]

15 Give an example of an operator whose matrix with respect to some basis contains only nonzero numbers on the diagonal, but the operator is not invertible.

16 Rewrite the proof of 5.21 using the linear map that sends $p \in \mathcal{P}_n(\mathbf{C})$ to $(p(T))v \in V$ (and use 3.23).

17 Rewrite the proof of 5.21 using the linear map that sends $p \in \mathcal{P}_{n^2}(\mathbf{C})$ to $p(T) \in \mathcal{L}(V)$ (and use 3.23).

18 Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$. Define a function $f \colon \mathbf{C} \to \mathbf{R}$ by

\[ f(\lambda) = \dim \operatorname{range}(T - \lambda I). \]

Prove that $f$ is not a continuous function.

19 Suppose $V$ is finite-dimensional with $\dim V > 1$ and $T \in \mathcal{L}(V)$. Prove that

\[ \{p(T) : p \in \mathcal{P}(\mathbf{F})\} \neq \mathcal{L}(V). \]

20 Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$. Prove that $T$ has an invariant subspace of dimension $k$ for each $k = 1, \dots, \dim V$.
5.C Eigenspaces and Diagonal Matrices

5.34 Definition diagonal matrix

A diagonal matrix is a square matrix that is 0 everywhere except possibly along the diagonal.

5.35 Example

\[
\begin{pmatrix}
8 & 0 & 0 \\
0 & 5 & 0 \\
0 & 0 & 5
\end{pmatrix}
\]

is a diagonal matrix.

Obviously  every  diagonal  matrix  is  upper  triangular.  In  general,  a  diagonal 
matrix  has  many  more  0’s  than  an  upper-triangular  matrix. 

If  an  operator  has  a  diagonal  matrix  with  respect  to  some  basis,  then  the 
entries  along  the  diagonal  are  precisely  the  eigenvalues  of  the  operator;  this 
follows  from  5.32  (or  find  an  easier  proof  for  diagonal  matrices). 

5.36 Definition eigenspace, $E(\lambda, T)$

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. The eigenspace of $T$ corresponding to $\lambda$, denoted $E(\lambda, T)$, is defined by

\[ E(\lambda, T) = \operatorname{null}(T - \lambda I). \]

In other words, $E(\lambda, T)$ is the set of all eigenvectors of $T$ corresponding to $\lambda$, along with the 0 vector.

For $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, the eigenspace $E(\lambda, T)$ is a subspace of $V$ (because the null space of each linear map on $V$ is a subspace of $V$). The definitions imply that $\lambda$ is an eigenvalue of $T$ if and only if $E(\lambda, T) \neq \{0\}$.

5.37 Example Suppose the matrix of an operator $T \in \mathcal{L}(V)$ with respect to a basis $v_1, v_2, v_3$ of $V$ is the matrix in Example 5.35 above. Then

\[ E(8, T) = \operatorname{span}(v_1), \qquad E(5, T) = \operatorname{span}(v_2, v_3). \]

If $\lambda$ is an eigenvalue of an operator $T \in \mathcal{L}(V)$, then $T$ restricted to $E(\lambda, T)$ is just the operator of multiplication by $\lambda$.




5.38 Sum of eigenspaces is a direct sum

Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Suppose also that $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$. Then

\[ E(\lambda_1, T) + \dots + E(\lambda_m, T) \]

is a direct sum. Furthermore,

\[ \dim E(\lambda_1, T) + \dots + \dim E(\lambda_m, T) \le \dim V. \]

Proof To show that $E(\lambda_1, T) + \dots + E(\lambda_m, T)$ is a direct sum, suppose

\[ u_1 + \dots + u_m = 0, \]

where each $u_j$ is in $E(\lambda_j, T)$. Because eigenvectors corresponding to distinct eigenvalues are linearly independent (see 5.10), this implies that each $u_j$ equals 0. This implies (using 1.44) that $E(\lambda_1, T) + \dots + E(\lambda_m, T)$ is a direct sum, as desired.

Now

\[
\dim E(\lambda_1, T) + \dots + \dim E(\lambda_m, T) = \dim\bigl(E(\lambda_1, T) \oplus \dots \oplus E(\lambda_m, T)\bigr) \le \dim V,
\]

where the equality above follows from Exercise 16 in Section 2.C. ■

5.39 Definition diagonalizable

An operator $T \in \mathcal{L}(V)$ is called diagonalizable if the operator has a diagonal matrix with respect to some basis of $V$.


5.40 Example Define $T \in \mathcal{L}(\mathbf{R}^2)$ by

\[ T(x, y) = (41x + 7y, -20x + 74y). \]

The matrix of $T$ with respect to the standard basis of $\mathbf{R}^2$ is

\[
\begin{pmatrix}
41 & 7 \\
-20 & 74
\end{pmatrix},
\]

which is not a diagonal matrix. However, $T$ is diagonalizable, because the matrix of $T$ with respect to the basis $(1, 4), (7, 5)$ is

\[
\begin{pmatrix}
69 & 0 \\
0 & 46
\end{pmatrix},
\]

as you should verify.
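The verification requested in Example 5.40 amounts to checking that $(1, 4)$ and $(7, 5)$ are eigenvectors with eigenvalues 69 and 46. A short Python sketch of that check follows; the function names are hypothetical, and integer arithmetic keeps the comparison exact.

```python
def T(v):
    """The operator of Example 5.40 on R^2."""
    x, y = v
    return (41 * x + 7 * y, -20 * x + 74 * y)


def scale(c, v):
    """Scalar multiple c*v of a tuple v."""
    return tuple(c * a for a in v)


def is_eigenvector(v, eigenvalue):
    """True if v is nonzero and T(v) equals eigenvalue * v."""
    return any(a != 0 for a in v) and T(v) == scale(eigenvalue, v)


print(T((1, 4)), is_eigenvector((1, 4), 69))
print(T((7, 5)), is_eigenvector((7, 5), 46))
```

Because $T(1, 4) = 69\,(1, 4)$ and $T(7, 5) = 46\,(7, 5)$, the columns of the matrix of $T$ with respect to this basis are $(69, 0)$ and $(0, 46)$.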
5.41 Conditions equivalent to diagonalizability

Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$. Then the following are equivalent:

(a) $T$ is diagonalizable;

(b) $V$ has a basis consisting of eigenvectors of $T$;

(c) there exist 1-dimensional subspaces $U_1, \dots, U_n$ of $V$, each invariant under $T$, such that

\[ V = U_1 \oplus \dots \oplus U_n; \]

(d) $V = E(\lambda_1, T) \oplus \dots \oplus E(\lambda_m, T)$;

(e) $\dim V = \dim E(\lambda_1, T) + \dots + \dim E(\lambda_m, T)$.

Proof An operator $T \in \mathcal{L}(V)$ has a diagonal matrix

\[
\begin{pmatrix}
\lambda_1 & & 0 \\
 & \ddots & \\
0 & & \lambda_n
\end{pmatrix}
\]

with respect to a basis $v_1, \dots, v_n$ of $V$ if and only if $Tv_j = \lambda_j v_j$ for each $j$. Thus (a) and (b) are equivalent.

Suppose (b) holds; thus $V$ has a basis $v_1, \dots, v_n$ consisting of eigenvectors of $T$. For each $j$, let $U_j = \operatorname{span}(v_j)$. Obviously each $U_j$ is a 1-dimensional subspace of $V$ that is invariant under $T$. Because $v_1, \dots, v_n$ is a basis of $V$, each vector in $V$ can be written uniquely as a linear combination of $v_1, \dots, v_n$. In other words, each vector in $V$ can be written uniquely as a sum $u_1 + \dots + u_n$, where each $u_j$ is in $U_j$. Thus $V = U_1 \oplus \dots \oplus U_n$. Hence (b) implies (c).

Suppose now that (c) holds; thus there are 1-dimensional subspaces $U_1, \dots, U_n$ of $V$, each invariant under $T$, such that $V = U_1 \oplus \dots \oplus U_n$. For each $j$, let $v_j$ be a nonzero vector in $U_j$. Then each $v_j$ is an eigenvector of $T$. Because each vector in $V$ can be written uniquely as a sum $u_1 + \dots + u_n$, where each $u_j$ is in $U_j$ (so each $u_j$ is a scalar multiple of $v_j$), we see that $v_1, \dots, v_n$ is a basis of $V$. Thus (c) implies (b).

At this stage of the proof we know that (a), (b), and (c) are all equivalent. We will finish the proof by showing that (b) implies (d), that (d) implies (e), and that (e) implies (b).

Suppose (b) holds; thus $V$ has a basis consisting of eigenvectors of $T$. Hence every vector in $V$ is a linear combination of eigenvectors of $T$, which implies that

\[ V = E(\lambda_1, T) + \dots + E(\lambda_m, T). \]

Now 5.38 shows that (d) holds.

That (d) implies (e) follows immediately from Exercise 16 in Section 2.C.

Finally, suppose (e) holds; thus

\[ \dim V = \dim E(\lambda_1, T) + \dots + \dim E(\lambda_m, T). \tag{5.42} \]

Choose a basis of each $E(\lambda_j, T)$; put all these bases together to form a list $v_1, \dots, v_n$ of eigenvectors of $T$, where $n = \dim V$ (by 5.42). To show that this list is linearly independent, suppose

\[ a_1 v_1 + \dots + a_n v_n = 0, \]

where $a_1, \dots, a_n \in \mathbf{F}$. For each $j = 1, \dots, m$, let $u_j$ denote the sum of all the terms $a_k v_k$ such that $v_k \in E(\lambda_j, T)$. Thus each $u_j$ is in $E(\lambda_j, T)$, and

\[ u_1 + \dots + u_m = 0. \]

Because eigenvectors corresponding to distinct eigenvalues are linearly independent (see 5.10), this implies that each $u_j$ equals 0. Because each $u_j$ is a sum of terms $a_k v_k$, where the $v_k$'s were chosen to be a basis of $E(\lambda_j, T)$, this implies that all the $a_k$'s equal 0. Thus $v_1, \dots, v_n$ is linearly independent and hence is a basis of $V$ (by 2.39). Thus (e) implies (b), completing the proof. ■

Unfortunately  not  every  operator  is  diagonalizable.  This  sad  state  of  affairs 
can  arise  even  on  complex  vector  spaces,  as  shown  by  the  next  example. 

5.43 Example Show that the operator $T \in \mathcal{L}(\mathbf{C}^2)$ defined by

\[ T(w, z) = (z, 0) \]

is not diagonalizable.

Solution As you should verify, 0 is the only eigenvalue of $T$ and furthermore

\[ E(0, T) = \{(w, 0) \in \mathbf{C}^2 : w \in \mathbf{C}\}. \]

Thus conditions (b), (c), (d), and (e) of 5.41 are easily seen to fail (of course, because these conditions are equivalent, it is only necessary to check that one of them fails). Thus condition (a) of 5.41 also fails, and hence $T$ is not diagonalizable.
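Condition (e) of 5.41 is the easiest one to test by machine. The Python sketch below is illustrative only. It assumes the caller supplies the matrix of $T$ with respect to some basis together with the complete list of distinct eigenvalues of $T$, and it computes $\dim E(\lambda, T)$ as $\dim V$ minus the rank of $\mathcal{M}(T) - \lambda I$, which relies on the standard fact that $\dim \operatorname{null} S + \dim \operatorname{range} S = \dim V$ for an operator $S$ on $V$. All function names are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(a) for a in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def eigenspace_dim(matrix, lam):
    """dim E(lam, T) = dim V - rank of the matrix of T - lam*I."""
    n = len(matrix)
    shifted = [[matrix[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)


def satisfies_condition_e(matrix, eigenvalues):
    """Condition (e) of 5.41; eigenvalues must list every eigenvalue of T."""
    return sum(eigenspace_dim(matrix, lam) for lam in set(eigenvalues)) == len(matrix)


# Example 5.43: T(w, z) = (z, 0) has standard matrix [[0, 1], [0, 0]].
print(eigenspace_dim([[0, 1], [0, 0]], 0))           # dimension of E(0, T)
print(satisfies_condition_e([[0, 1], [0, 0]], [0]))  # condition (e) fails
```

For the operator of Example 5.43 the only eigenspace has dimension 1, which is less than $\dim \mathbf{C}^2 = 2$, matching the conclusion of the example.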
The next result shows that if an operator has as many distinct eigenvalues as the dimension of its domain, then the operator is diagonalizable.

5.44 Enough eigenvalues implies diagonalizability

If $T \in \mathcal{L}(V)$ has $\dim V$ distinct eigenvalues, then $T$ is diagonalizable.

Proof Suppose $T \in \mathcal{L}(V)$ has $\dim V$ distinct eigenvalues $\lambda_1, \dots, \lambda_{\dim V}$. For each $j$, let $v_j \in V$ be an eigenvector corresponding to the eigenvalue $\lambda_j$. Because eigenvectors corresponding to distinct eigenvalues are linearly independent (see 5.10), $v_1, \dots, v_{\dim V}$ is linearly independent. A linearly independent list of $\dim V$ vectors in $V$ is a basis of $V$ (see 2.39); thus $v_1, \dots, v_{\dim V}$ is a basis of $V$. With respect to this basis consisting of eigenvectors, $T$ has a diagonal matrix. ■


5.45 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$. Find a basis of $\mathbf{F}^3$ with respect to which $T$ has a diagonal matrix.

Solution With respect to the standard basis, the matrix of $T$ is

\[
\begin{pmatrix}
2 & 1 & 0 \\
0 & 5 & 3 \\
0 & 0 & 8
\end{pmatrix}.
\]

The matrix above is upper triangular. Thus by 5.32, the eigenvalues of $T$ are 2, 5, and 8. Because $T$ is an operator on a vector space with dimension 3 and $T$ has three distinct eigenvalues, 5.44 assures us that there exists a basis of $\mathbf{F}^3$ with respect to which $T$ has a diagonal matrix.

To find this basis, we only have to find an eigenvector for each eigenvalue. In other words, we have to find a nonzero solution to the equation

\[ T(x, y, z) = \lambda(x, y, z) \]

for $\lambda = 2$, then for $\lambda = 5$, and then for $\lambda = 8$. These simple equations are easy to solve: for $\lambda = 2$ we have the eigenvector $(1, 0, 0)$; for $\lambda = 5$ we have the eigenvector $(1, 3, 0)$; for $\lambda = 8$ we have the eigenvector $(1, 6, 6)$.

Thus $(1, 0, 0), (1, 3, 0), (1, 6, 6)$ is a basis of $\mathbf{F}^3$, and with respect to this basis the matrix of $T$ is

\[
\begin{pmatrix}
2 & 0 & 0 \\
0 & 5 & 0 \\
0 & 0 & 8
\end{pmatrix}.
\]

The converse of 5.44 is not true. For example, the operator $T$ defined on the three-dimensional space $\mathbf{F}^3$ by

\[ T(z_1, z_2, z_3) = (4z_1, 4z_2, 5z_3) \]

has only two eigenvalues (4 and 5), but this operator has a diagonal matrix with respect to the standard basis.

In later chapters we will find additional conditions that imply that certain operators are diagonalizable.

EXERCISES 5.C

1 Suppose $T \in \mathcal{L}(V)$ is diagonalizable. Prove that $V = \operatorname{null} T \oplus \operatorname{range} T$.

2 Prove the converse of the statement in the exercise above or give a counterexample to the converse.

3 Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Prove that the following are equivalent:

(a) $V = \operatorname{null} T \oplus \operatorname{range} T$.

(b) $V = \operatorname{null} T + \operatorname{range} T$.

(c) $\operatorname{null} T \cap \operatorname{range} T = \{0\}$.

4 Give an example to show that the exercise above is false without the hypothesis that $V$ is finite-dimensional.

5 Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$. Prove that $T$ is diagonalizable if and only if

\[ V = \operatorname{null}(T - \lambda I) \oplus \operatorname{range}(T - \lambda I) \]

for every $\lambda \in \mathbf{C}$.

6 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$ has $\dim V$ distinct eigenvalues, and $S \in \mathcal{L}(V)$ has the same eigenvectors as $T$ (not necessarily with the same eigenvalues). Prove that $ST = TS$.

7 Suppose $T \in \mathcal{L}(V)$ has a diagonal matrix $A$ with respect to some basis of $V$ and that $\lambda \in \mathbf{F}$. Prove that $\lambda$ appears on the diagonal of $A$ precisely $\dim E(\lambda, T)$ times.

8 Suppose $T \in \mathcal{L}(\mathbf{F}^5)$ and $\dim E(8, T) = 4$. Prove that $T - 2I$ or $T - 6I$ is invertible.

9 Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that $E(\lambda, T) = E(\tfrac{1}{\lambda}, T^{-1})$ for every $\lambda \in \mathbf{F}$ with $\lambda \neq 0$.

10 Suppose that $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ denote the distinct nonzero eigenvalues of $T$. Prove that

\[ \dim E(\lambda_1, T) + \dots + \dim E(\lambda_m, T) \le \dim \operatorname{range} T. \]

11 Verify the assertion in Example 5.40.

12 Suppose $R, T \in \mathcal{L}(\mathbf{F}^3)$ each have 2, 6, 7 as eigenvalues. Prove that there exists an invertible operator $S \in \mathcal{L}(\mathbf{F}^3)$ such that $R = S^{-1}TS$.

13 Find $R, T \in \mathcal{L}(\mathbf{F}^4)$ such that $R$ and $T$ each have 2, 6, 7 as eigenvalues, $R$ and $T$ have no other eigenvalues, and there does not exist an invertible operator $S \in \mathcal{L}(\mathbf{F}^4)$ such that $R = S^{-1}TS$.

14 Find $T \in \mathcal{L}(\mathbf{C}^3)$ such that 6 and 7 are eigenvalues of $T$ and such that $T$ does not have a diagonal matrix with respect to any basis of $\mathbf{C}^3$.

15 Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is such that 6 and 7 are eigenvalues of $T$. Furthermore, suppose $T$ does not have a diagonal matrix with respect to any basis of $\mathbf{C}^3$. Prove that there exists $(x, y, z) \in \mathbf{C}^3$ such that $T(x, y, z) = (17 + 8x, \sqrt{5} + 8y, 2\pi + 8z)$.

16 The Fibonacci sequence $F_1, F_2, \dots$ is defined by

\[ F_1 = 1, \quad F_2 = 1, \quad \text{and} \quad F_n = F_{n-2} + F_{n-1} \ \text{for } n \ge 3. \]

Define $T \in \mathcal{L}(\mathbf{R}^2)$ by $T(x, y) = (y, x + y)$.

(a) Show that $T^n(0, 1) = (F_n, F_{n+1})$ for each positive integer $n$.

(b) Find the eigenvalues of $T$.

(c) Find a basis of $\mathbf{R}^2$ consisting of eigenvectors of $T$.

(d) Use the solution to part (c) to compute $T^n(0, 1)$. Conclude that

\[ F_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right] \]

for each positive integer $n$.

(e) Use part (d) to conclude that for each positive integer $n$, the Fibonacci number $F_n$ is the integer that is closest to

\[ \frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^n. \]
Inner Product Spaces


In making the definition of a vector space, we generalized the linear structure (addition and scalar multiplication) of $\mathbf{R}^2$ and $\mathbf{R}^3$. We ignored other important features, such as the notions of length and angle. These ideas are embedded in the concept we now investigate, inner products.

Our  standing  assumptions  are  as  follows: 

6.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  denotes  a  vector  space  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  Cauchy-Schwarz  Inequality 

■  Gram-Schmidt  Procedure 

■  linear  functionals  on  inner  product  spaces 

■  calculating  minimum  distance  to  a  subspace 
6.A Inner Products and Norms

Inner Products

To motivate the concept of inner product, think of vectors in $\mathbf{R}^2$ and $\mathbf{R}^3$ as arrows with initial point at the origin. The length of a vector $x$ in $\mathbf{R}^2$ or $\mathbf{R}^3$ is called the norm of $x$, denoted $\|x\|$. Thus for $x = (x_1, x_2) \in \mathbf{R}^2$, we have $\|x\| = \sqrt{x_1^2 + x_2^2}$.

[Figure caption: The length of this vector $x$ is $\sqrt{x_1^2 + x_2^2}$.]

Similarly, if $x = (x_1, x_2, x_3) \in \mathbf{R}^3$, then $\|x\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$.

Even though we cannot draw pictures in higher dimensions, the generalization to $\mathbf{R}^n$ is obvious: we define the norm of $x = (x_1, \dots, x_n) \in \mathbf{R}^n$ by

\[ \|x\| = \sqrt{x_1^2 + \dots + x_n^2}. \]

The norm is not linear on $\mathbf{R}^n$. To inject linearity into the discussion, we introduce the dot product.


6.2 Definition dot product

For $x, y \in \mathbf{R}^n$, the dot product of $x$ and $y$, denoted $x \cdot y$, is defined by

\[ x \cdot y = x_1 y_1 + \dots + x_n y_n, \]

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$.


Note that the dot product of two vectors in $\mathbf{R}^n$ is a number, not a vector. Obviously $x \cdot x = \|x\|^2$ for all $x \in \mathbf{R}^n$. The dot product on $\mathbf{R}^n$ has the following properties:

• $x \cdot x \ge 0$ for all $x \in \mathbf{R}^n$;

• $x \cdot x = 0$ if and only if $x = 0$;

• for $y \in \mathbf{R}^n$ fixed, the map from $\mathbf{R}^n$ to $\mathbf{R}$ that sends $x \in \mathbf{R}^n$ to $x \cdot y$ is linear;

• $x \cdot y = y \cdot x$ for all $x, y \in \mathbf{R}^n$.

If we think of vectors as points instead of arrows, then $\|x\|$ should be interpreted as the distance from the origin to the point $x$.
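The dot product and the norm on $\mathbf{R}^n$ translate directly into a few lines of Python. The sketch below is a minimal illustration with hypothetical function names, representing vectors as lists of numbers.

```python
import math


def dot(x, y):
    """Dot product x . y = x_1*y_1 + ... + x_n*y_n of two vectors in R^n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm of x, defined as the square root of x . x."""
    return math.sqrt(dot(x, x))


print(dot([1, 2, 3], [4, 5, 6]))                       # a number, not a vector
print(norm([3, 4]))                                    # length of the arrow (3, 4)
print(norm([1, 1]), norm([1, 0]) + norm([0, 1]))       # the norm is not additive
```

The last line compares $\|(1, 1)\| = \sqrt{2}$ with $\|(1, 0)\| + \|(0, 1)\| = 2$, a concrete instance of the norm failing to be linear.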
An inner product is a generalization of the dot product. At this point you may be tempted to guess that an inner product is defined by abstracting the properties of the dot product discussed in the last paragraph. For real vector spaces, that guess is correct. However, so that we can make a definition that will be useful for both real and complex vector spaces, we need to examine the complex case before making the definition.

Recall that if $\lambda = a + bi$, where $a, b \in \mathbf{R}$, then

• the absolute value of $\lambda$, denoted $|\lambda|$, is defined by $|\lambda| = \sqrt{a^2 + b^2}$;

• the complex conjugate of $\lambda$, denoted $\bar{\lambda}$, is defined by $\bar{\lambda} = a - bi$;

• $|\lambda|^2 = \lambda \bar{\lambda}$.

See Chapter 4 for the definitions and the basic properties of the absolute value and complex conjugate.

For $z = (z_1, \dots, z_n) \in \mathbf{C}^n$, we define the norm of $z$ by

\[ \|z\| = \sqrt{|z_1|^2 + \dots + |z_n|^2}. \]

The absolute values are needed because we want $\|z\|$ to be a nonnegative number. Note that

\[ \|z\|^2 = z_1 \overline{z_1} + \dots + z_n \overline{z_n}. \]

We want to think of $\|z\|^2$ as the inner product of $z$ with itself, as we did in $\mathbf{R}^n$. The equation above thus suggests that the inner product of $w = (w_1, \dots, w_n) \in \mathbf{C}^n$ with $z$ should equal

\[ w_1 \overline{z_1} + \dots + w_n \overline{z_n}. \]

If the roles of the $w$ and $z$ were interchanged, the expression above would be replaced with its complex conjugate. In other words, we should expect that the inner product of $w$ with $z$ equals the complex conjugate of the inner product of $z$ with $w$. With that motivation, we are now ready to define an inner product on $V$, which may be a real or a complex vector space.

Two comments about the notation used in the next definition:

• If $\lambda$ is a complex number, then the notation $\lambda \ge 0$ means that $\lambda$ is real and nonnegative.

• We use the common notation $\langle u, v \rangle$, with angle brackets denoting an inner product. Some people use parentheses instead, but then $(u, v)$ becomes ambiguous because it could denote either an ordered pair or an inner product.
6.3 Definition inner product

An inner product on $V$ is a function that takes each ordered pair $(u, v)$ of elements of $V$ to a number $\langle u, v \rangle \in \mathbf{F}$ and has the following properties:

positivity

$\langle v, v \rangle \ge 0$ for all $v \in V$;

definiteness

$\langle v, v \rangle = 0$ if and only if $v = 0$;

additivity in first slot

$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in V$;

homogeneity in first slot

$\langle \lambda u, v \rangle = \lambda \langle u, v \rangle$ for all $\lambda \in \mathbf{F}$ and all $u, v \in V$;

conjugate symmetry

$\langle u, v \rangle = \overline{\langle v, u \rangle}$ for all $u, v \in V$.

Although most mathematicians define an inner product as above, many physicists use a definition that requires homogeneity in the second slot instead of the first slot.

Every real number equals its complex conjugate. Thus if we are dealing with a real vector space, then in the last condition above we can dispense with the complex conjugate and simply state that $\langle u, v \rangle = \langle v, u \rangle$ for all $u, v \in V$.


6.4 Example inner products

(a) The Euclidean inner product on $\mathbf{F}^n$ is defined by

\[ \langle (w_1, \dots, w_n), (z_1, \dots, z_n) \rangle = w_1 \overline{z_1} + \dots + w_n \overline{z_n}. \]

(b) If $c_1, \dots, c_n$ are positive numbers, then an inner product can be defined on $\mathbf{F}^n$ by

\[ \langle (w_1, \dots, w_n), (z_1, \dots, z_n) \rangle = c_1 w_1 \overline{z_1} + \dots + c_n w_n \overline{z_n}. \]

(c) An inner product can be defined on the vector space of continuous real-valued functions on the interval $[-1, 1]$ by

\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x) \, dx. \]

(d) An inner product can be defined on $\mathcal{P}(\mathbf{R})$ by

\[ \langle p, q \rangle = \int_{0}^{\infty} p(x) q(x) e^{-x} \, dx. \]
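Two of the inner products in Example 6.4 can be evaluated exactly with very little code. The Python sketch below is an illustration with hypothetical names. For part (a) it uses Python's built-in complex numbers. For part (d) it represents a polynomial by its list of real coefficients, lowest degree first, and uses the standard fact that $\int_0^\infty x^k e^{-x} \, dx = k!$, so the integral reduces to a finite sum.

```python
from math import factorial


def euclidean_inner(w, z):
    """Part (a): <w, z> = w_1*conj(z_1) + ... + w_n*conj(z_n)."""
    return sum(a * complex(b).conjugate() for a, b in zip(w, z))


def poly_inner(p, q):
    """Part (d): <p, q> = integral over [0, inf) of p(x) q(x) e^(-x) dx.

    p and q are coefficient lists, lowest degree first. Expanding the product
    and integrating term by term gives the sum of p_i * q_j * (i + j)!.
    """
    return sum(a * b * factorial(i + j)
               for i, a in enumerate(p)
               for j, b in enumerate(q))


w = [1 + 2j, 3]
z = [2 - 1j, 1j]
print(euclidean_inner(w, z), euclidean_inner(z, w))  # complex conjugates of each other
print(euclidean_inner(w, w))                         # real and nonnegative
print(poly_inner([1, 1], [0, 1]))                    # <1 + x, x>
```

The first printed pair illustrates conjugate symmetry, and the second line illustrates positivity for the Euclidean inner product on $\mathbf{C}^2$.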
6.5 Definition inner product space

An inner product space is a vector space $V$ along with an inner product on $V$.

The most important example of an inner product space is $\mathbf{F}^n$ with the Euclidean inner product given by part (a) of the last example. When $\mathbf{F}^n$ is referred to as an inner product space, you should assume that the inner product is the Euclidean inner product unless explicitly told otherwise.

So  that  we  do  not  have  to  keep  repeating  the  hypothesis  that  V  is  an  inner 
product  space,  for  the  rest  of  this  chapter  we  make  the  following  assumption: 

6.6  Notation  V 

For  the  rest  of  this  chapter,  V  denotes  an  inner  product  space  over  F. 

Note  the  slight  abuse  of  language  here.  An  inner  product  space  is  a  vector 
space  along  with  an  inner  product  on  that  vector  space.  When  we  say  that 
a  vector  space  V  is  an  inner  product  space,  we  are  also  thinking  that  an 
inner  product  on  V  is  lurking  nearby  or  is  obvious  from  the  context  (or  is  the 
Euclidean inner product if the vector space is $\mathbf{F}^n$).

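6.7  Basic properties of an inner product

(a) For each fixed $u \in V$, the function that takes $v$ to $\langle v, u \rangle$ is a linear
map from $V$ to $\mathbf{F}$.

(b) $\langle 0, u \rangle = 0$ for every $u \in V$.

(c) $\langle u, 0 \rangle = 0$ for every $u \in V$.

(d) $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$ for all $u, v, w \in V$.

(e) $\langle u, \lambda v \rangle = \bar{\lambda} \langle u, v \rangle$ for all $\lambda \in \mathbf{F}$ and $u, v \in V$.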

Proof 

(a)  Part  (a)  follows  from  the  conditions  of  additivity  in  the  first  slot  and 
homogeneity  in  the  first  slot  in  the  definition  of  an  inner  product. 

(b)  Part  (b)  follows  from  part  (a)  and  the  result  that  every  linear  map  takes 
0  to  0. 
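(c) Part (c) follows from part (a) and the conjugate symmetry property in
the definition of an inner product.

(d) Suppose $u, v, w \in V$. Then

\begin{align*}
\langle u, v + w \rangle &= \overline{\langle v + w, u \rangle} \\
&= \overline{\langle v, u \rangle + \langle w, u \rangle} \\
&= \overline{\langle v, u \rangle} + \overline{\langle w, u \rangle} \\
&= \langle u, v \rangle + \langle u, w \rangle.
\end{align*}

(e) Suppose $\lambda \in \mathbf{F}$ and $u, v \in V$. Then

\begin{align*}
\langle u, \lambda v \rangle &= \overline{\langle \lambda v, u \rangle} \\
&= \overline{\lambda \langle v, u \rangle} \\
&= \bar{\lambda} \, \overline{\langle v, u \rangle} \\
&= \bar{\lambda} \langle u, v \rangle,
\end{align*}

as desired. ■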

Norms 

Our motivation for defining inner products came initially from the norms of
vectors on $\mathbf{R}^2$ and $\mathbf{R}^3$. Now we see that each inner product determines a
norm.


6.8  Definition  norm, $\|v\|$

For $v \in V$, the norm of $v$, denoted $\|v\|$, is defined by

\[ \|v\| = \sqrt{\langle v, v \rangle}. \]


6.9  Example  norms

(a) If $(z_1, \dots, z_n) \in \mathbf{F}^n$ (with the Euclidean inner product), then

\[ \|(z_1, \dots, z_n)\| = \sqrt{|z_1|^2 + \cdots + |z_n|^2}. \]

(b) In the vector space of continuous real-valued functions on $[-1, 1]$ [with
inner product given as in part (c) of 6.4], we have

\[ \|f\| = \sqrt{\int_{-1}^{1} \bigl(f(x)\bigr)^2 \, dx}. \]
6.10  Basic properties of the norm

Suppose $v \in V$.

(a) $\|v\| = 0$ if and only if $v = 0$.

(b) $\|\lambda v\| = |\lambda| \, \|v\|$ for all $\lambda \in \mathbf{F}$.


Proof 

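(a) The desired result holds because $\langle v, v \rangle = 0$ if and only if $v = 0$.

(b) Suppose $\lambda \in \mathbf{F}$. Then

\begin{align*}
\|\lambda v\|^2 &= \langle \lambda v, \lambda v \rangle \\
&= \lambda \langle v, \lambda v \rangle \\
&= \lambda \bar{\lambda} \langle v, v \rangle \\
&= |\lambda|^2 \|v\|^2.
\end{align*}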


Taking  square  roots  now  gives  the  desired  equality.  ■ 

The  proof  above  of  part  (b)  illustrates  a  general  principle:  working  with 
norms  squared  is  usually  easier  than  working  directly  with  norms. 

Now  we  come  to  a  crucial  definition. 


6.11  Definition  orthogonal 

Two vectors $u, v \in V$ are called orthogonal if $\langle u, v \rangle = 0$.


In the definition above, the order of the vectors does not matter, because
$\langle u, v \rangle = 0$ if and only if $\langle v, u \rangle = 0$. Instead of saying that $u$ and $v$ are
orthogonal, sometimes we say that $u$ is orthogonal to $v$.

Exercise 13 asks you to prove that if $u, v$ are nonzero vectors in $\mathbf{R}^2$, then

\[ \langle u, v \rangle = \|u\| \, \|v\| \cos\theta, \]

where $\theta$ is the angle between $u$ and $v$ (thinking of $u$ and $v$ as arrows with initial
point at the origin). Thus two vectors in $\mathbf{R}^2$ are orthogonal (with respect to the
usual Euclidean inner product) if and only if the cosine of the angle between
them is 0, which happens if and only if the vectors are perpendicular in the
usual sense of plane geometry. Thus you can think of the word orthogonal as
a fancy word meaning perpendicular.
We begin our study of orthogonality with an easy result.


6.12  Orthogonality and 0

(a) 0 is orthogonal to every vector in $V$.

(b) 0 is the only vector in $V$ that is orthogonal to itself.


Proof

(a) Part (b) of 6.7 states that $\langle 0, u \rangle = 0$ for every $u \in V$.

(b) If $v \in V$ and $\langle v, v \rangle = 0$, then $v = 0$ (by definition of inner product). ■


The  word  orthogonal  comes  from 
the  Greek  word  orthogonios , 
which  means  right-angled. 

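For the special case $V = \mathbf{R}^2$, the
next theorem is over 2,500 years old.
Of course, the proof below is not the
original proof.


6.13  Pythagorean Theorem

Suppose $u$ and $v$ are orthogonal vectors in $V$. Then

\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2. \]

Proof  We have

\begin{align*}
\|u + v\|^2 &= \langle u + v, u + v \rangle \\
&= \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle \\
&= \|u\|^2 + \|v\|^2,
\end{align*}

as desired. ■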


The proof given above of the
Pythagorean Theorem shows that
the conclusion holds if and only
if $\langle u, v \rangle + \langle v, u \rangle$, which equals
$2 \operatorname{Re} \langle u, v \rangle$, is 0. Thus the converse
of the Pythagorean Theorem holds
in real inner product spaces.


Suppose $u, v \in V$, with $v \ne 0$. We
would like to write $u$ as a scalar multiple
of $v$ plus a vector $w$ orthogonal to $v$, as
suggested in the next picture.


[Figure: An orthogonal decomposition.]


To discover how to write $u$ as a scalar multiple of $v$ plus a vector orthogonal
to $v$, let $c \in \mathbf{F}$ denote a scalar. Then

\[ u = cv + (u - cv). \]

Thus we need to choose $c$ so that $v$ is orthogonal to $(u - cv)$. In other words,
we want

\[ 0 = \langle u - cv, v \rangle = \langle u, v \rangle - c \|v\|^2. \]

The equation above shows that we should choose $c$ to be $\langle u, v \rangle / \|v\|^2$. Making
this choice of $c$, we can write

\[ u = \frac{\langle u, v \rangle}{\|v\|^2} v + \left( u - \frac{\langle u, v \rangle}{\|v\|^2} v \right). \]

As  you  should  verify,  the  equation  above  writes  u  as  a  scalar  multiple  of  v 
plus  a  vector  orthogonal  to  v.  In  other  words,  we  have  proved  the  following 
result. 


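6.14  An orthogonal decomposition

Suppose $u, v \in V$, with $v \ne 0$. Set $c = \dfrac{\langle u, v \rangle}{\|v\|^2}$ and $w = u - \dfrac{\langle u, v \rangle}{\|v\|^2} v$. Then

\[ \langle w, v \rangle = 0 \quad \text{and} \quad u = cv + w. \]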
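The decomposition in 6.14 can be tried out on concrete vectors. The sketch below is restricted to the real case (so no conjugates appear) and uses exact rational arithmetic; the helper names `dot` and `orthogonal_decomposition` and the sample vectors are assumptions chosen for illustration.

```python
from fractions import Fraction

def dot(x, y):
    """Euclidean inner product on R^n (real case, so no conjugates)."""
    return sum(a * b for a, b in zip(x, y))

def orthogonal_decomposition(u, v):
    """Return (c, w) with u = c*v + w and <w, v> = 0, following 6.14."""
    norm_sq = dot(v, v)
    if norm_sq == 0:
        raise ValueError("v must be nonzero")
    c = Fraction(dot(u, v)) / norm_sq          # c = <u, v> / ||v||^2
    w = [a - c * b for a, b in zip(u, v)]      # w = u - c v
    return c, w

u = [Fraction(3), Fraction(1)]
v = [Fraction(1), Fraction(1)]
c, w = orthogonal_decomposition(u, v)
```

With these sample vectors, `c` is the scalar multiplying `v` and `w` is the part of `u` orthogonal to `v`; evaluating `dot(w, v)` confirms the orthogonality.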


The orthogonal decomposition 6.14
will be used in the proof of the
Cauchy-Schwarz Inequality, which is our next
result and is one of the most important
inequalities in mathematics.


French mathematician Augustin-Louis
Cauchy (1789-1857) proved
6.17(a) in 1821. German mathematician
Hermann Schwarz (1843-1921)
proved 6.17(b) in 1886.
6.15  Cauchy-Schwarz Inequality

Suppose $u, v \in V$. Then

\[ |\langle u, v \rangle| \le \|u\| \, \|v\|. \]

This inequality is an equality if and only if one of $u, v$ is a scalar multiple
of the other.

Proof  If $v = 0$, then both sides of the desired inequality equal 0. Thus we
can assume that $v \ne 0$. Consider the orthogonal decomposition

\[ u = \frac{\langle u, v \rangle}{\|v\|^2} v + w \]

given by 6.14, where $w$ is orthogonal to $v$. By the Pythagorean Theorem,

\begin{align*}
\|u\|^2 &= \left\| \frac{\langle u, v \rangle}{\|v\|^2} v \right\|^2 + \|w\|^2 \\
&= \frac{|\langle u, v \rangle|^2}{\|v\|^2} + \|w\|^2 \\
&\ge \frac{|\langle u, v \rangle|^2}{\|v\|^2}. \tag{6.16}
\end{align*}

Multiplying both sides of this inequality by $\|v\|^2$ and then taking square roots
gives the desired inequality.

Looking  at  the  proof  in  the  paragraph  above,  note  that  the  Cauchy-Schwarz 
Inequality  is  an  equality  if  and  only  if  6.16  is  an  equality.  Obviously  this 
happens  if  and  only  if  w  =  0.  But  w  =  0  if  and  only  if  u  is  a  multiple  of  v 
(see  6.14).  Thus  the  Cauchy-Schwarz  Inequality  is  an  equality  if  and  only  if 
u  is  a  scalar  multiple  of  v  or  v  is  a  scalar  multiple  of  u  (or  both;  the  phrasing 
has  been  chosen  to  cover  cases  in  which  either  u  or  v  equals  0).  ■ 


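6.17  Example  examples of the Cauchy-Schwarz Inequality

(a) If $x_1, \dots, x_n, y_1, \dots, y_n \in \mathbf{R}$, then

\[ |x_1 y_1 + \cdots + x_n y_n|^2 \le (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2). \]

(b) If $f, g$ are continuous real-valued functions on $[-1, 1]$, then

\[ \left| \int_{-1}^{1} f(x) g(x) \, dx \right|^2 \le \left( \int_{-1}^{1} \bigl(f(x)\bigr)^2 \, dx \right) \left( \int_{-1}^{1} \bigl(g(x)\bigr)^2 \, dx \right). \]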
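Part (a) of the example above is easy to probe numerically. The following Python sketch computes the difference between the right and left sides for integer vectors, where the arithmetic is exact. The helper name `cauchy_schwarz_gap`, the random test data, and the fixed seed are assumptions of this illustration, not part of the text.

```python
import random

def cauchy_schwarz_gap(x, y):
    """Return (x1^2+...+xn^2)(y1^2+...+yn^2) - (x1*y1+...+xn*yn)^2."""
    dot = sum(a * b for a, b in zip(x, y))
    return sum(a * a for a in x) * sum(b * b for b in y) - dot * dot

random.seed(0)  # fixed seed so the run is repeatable
gaps = []
for _ in range(1000):
    x = [random.randint(-9, 9) for _ in range(5)]
    y = [random.randint(-9, 9) for _ in range(5)]
    gaps.append(cauchy_schwarz_gap(x, y))

# The inequality says every gap is nonnegative.
all_nonnegative = all(g >= 0 for g in gaps)

# Equality case: y is a scalar multiple of x, so the gap is exactly 0.
equality_gap = cauchy_schwarz_gap([1, 2, 3], [2, 4, 6])
```

The random trials only illustrate the inequality; the equality case mirrors the statement in 6.15 that equality holds when one vector is a scalar multiple of the other.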



The next result, called the Triangle
Inequality, has the geometric interpretation
that the length of each side of a triangle
is less than the sum of the lengths
of the other two sides.

Note that the Triangle Inequality implies
that the shortest path between two
points is a line segment.

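6.18  Triangle Inequality

Suppose $u, v \in V$. Then

\[ \|u + v\| \le \|u\| + \|v\|. \]

This inequality is an equality if and only if one of $u, v$ is a nonnegative
multiple of the other.


Proof  We have

\begin{align*}
\|u + v\|^2 &= \langle u + v, u + v \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \langle v, u \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \overline{\langle u, v \rangle} \\
&= \|u\|^2 + \|v\|^2 + 2 \operatorname{Re} \langle u, v \rangle \\
&\le \|u\|^2 + \|v\|^2 + 2 |\langle u, v \rangle| \tag{6.19} \\
&\le \|u\|^2 + \|v\|^2 + 2 \|u\| \, \|v\| \tag{6.20} \\
&= (\|u\| + \|v\|)^2,
\end{align*}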

where  6.20  follows  from  the  Cauchy-Schwarz  Inequality  (6.15).  Taking 
square  roots  of  both  sides  of  the  inequality  above  gives  the  desired  inequality. 

The  proof  above  shows  that  the  Triangle  Inequality  is  an  equality  if  and 
only  if  we  have  equality  in  6.19  and  6.20.  Thus  we  have  equality  in  the 
Triangle  Inequality  if  and  only  if 


6.21    \[ \langle u, v \rangle = \|u\| \, \|v\|. \]


If  one  of  u,v  is  a  nonnegative  multiple  of  the  other,  then  6.21  holds,  as 
you  should  verify.  Conversely,  suppose  6.21  holds.  Then  the  condition  for 
equality  in  the  Cauchy-Schwarz  Inequality  (6.15)  implies  that  one  of  u,  v  is  a 
scalar  multiple  of  the  other.  Clearly  6.21  forces  the  scalar  in  question  to  be 
nonnegative,  as  desired.  ■ 
The next result is called the parallelogram equality because of its geometric
interpretation: in every parallelogram, the sum of the squares of the lengths
of the diagonals equals the sum of the squares of the lengths of the four sides.


[Figure: The parallelogram equality.]


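6.22  Parallelogram Equality

Suppose $u, v \in V$. Then

\[ \|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2). \]


Proof  We have

\begin{align*}
\|u + v\|^2 + \|u - v\|^2 &= \langle u + v, u + v \rangle + \langle u - v, u - v \rangle \\
&= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle \\
&\quad + \|u\|^2 + \|v\|^2 - \langle u, v \rangle - \langle v, u \rangle \\
&= 2(\|u\|^2 + \|v\|^2),
\end{align*}

as desired. ■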
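As a quick sanity check of the parallelogram equality, the sketch below evaluates both sides for a pair of vectors with complex entries under the Euclidean norm. The helper name `norm_sq` and the sample vectors are assumptions made for this illustration; the entries are Gaussian integers so the floating-point arithmetic is exact.

```python
def norm_sq(v):
    """Squared Euclidean norm |v_1|^2 + ... + |v_n|^2."""
    return sum((z * z.conjugate()).real for z in map(complex, v))

u = [1 + 2j, -3, 4j]
v = [2 - 1j, 5j, 1]

plus = [a + b for a, b in zip(u, v)]    # u + v
minus = [a - b for a, b in zip(u, v)]   # u - v

lhs = norm_sq(plus) + norm_sq(minus)    # ||u + v||^2 + ||u - v||^2
rhs = 2 * (norm_sq(u) + norm_sq(v))     # 2(||u||^2 + ||v||^2)
```

Both `lhs` and `rhs` come out to the same number for this pair, as 6.22 predicts.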


Law professor Richard Friedman presenting a case before the U.S.
Supreme Court in 2010:

Mr.  Friedman:  I  think  that  issue  is  entirely  orthogonal  to  the  issue  here 

because  the  Commonwealth  is  acknowledging — 

Chief  Justice  Roberts :  I’m  sorry.  Entirely  what? 

Mr.  Friedman:  Orthogonal.  Right  angle.  Unrelated.  Irrelevant. 

Chief  Justice  Roberts :  Oh. 

Justice  Scalia:  What  was  that  adjective?  I  liked  that. 

Mr.  Friedman:  Orthogonal. 

Chief  Justice  Roberts :  Orthogonal. 

Mr.  Friedman:  Right,  right. 

Justice  Scalia :  Orthogonal,  ooh.  (Laughter.) 

Justice  Kennedy:  I  knew  this  case  presented  us  a  problem.  (Laughter.) 
EXERCISES 6.A


1  Show that the function that takes $((x_1, x_2), (y_1, y_2)) \in \mathbf{R}^2 \times \mathbf{R}^2$ to
$|x_1 y_1| + |x_2 y_2|$ is not an inner product on $\mathbf{R}^2$.

2  Show that the function that takes $((x_1, x_2, x_3), (y_1, y_2, y_3)) \in \mathbf{R}^3 \times \mathbf{R}^3$
to $x_1 y_1 + x_3 y_3$ is not an inner product on $\mathbf{R}^3$.

3  Suppose $\mathbf{F} = \mathbf{R}$ and $V \ne \{0\}$. Replace the positivity condition (which
states that $\langle v, v \rangle \ge 0$ for all $v \in V$) in the definition of an inner product
(6.3) with the condition that $\langle v, v \rangle > 0$ for some $v \in V$. Show that this
change in the definition does not change the set of functions from $V \times V$
to $\mathbf{R}$ that are inner products on $V$.


4  Suppose $V$ is a real inner product space.

(a) Show that $\langle u + v, u - v \rangle = \|u\|^2 - \|v\|^2$ for every $u, v \in V$.

(b) Show that if $u, v \in V$ have the same norm, then $u + v$ is orthogonal
to $u - v$.

(c) Use part (b) to show that the diagonals of a rhombus are perpendicular
to each other.


5  Suppose $T \in \mathcal{L}(V)$ is such that $\|Tv\| \le \|v\|$ for every $v \in V$. Prove that
$T - \sqrt{2} I$ is invertible.

6  Suppose $u, v \in V$. Prove that $\langle u, v \rangle = 0$ if and only if

\[ \|u\| \le \|u + av\| \]

for all $a \in \mathbf{F}$.


7  Suppose $u, v \in V$. Prove that $\|au + bv\| = \|bu + av\|$ for all $a, b \in \mathbf{R}$
if and only if $\|u\| = \|v\|$.

8  Suppose $u, v \in V$ and $\|u\| = \|v\| = 1$ and $\langle u, v \rangle = 1$. Prove that $u = v$.

9  Suppose $u, v \in V$ and $\|u\| \le 1$ and $\|v\| \le 1$. Prove that

\[ \sqrt{1 - \|u\|^2} \, \sqrt{1 - \|v\|^2} \le 1 - |\langle u, v \rangle|. \]

10  Find vectors $u, v \in \mathbf{R}^2$ such that $u$ is a scalar multiple of $(1, 3)$, $v$ is
orthogonal to $(1, 3)$, and $(1, 2) = u + v$.
11  Prove that

\[ 16 \le (a + b + c + d) \left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d} \right) \]

for all positive numbers $a, b, c, d$.

12  Prove that

\[ (x_1 + \cdots + x_n)^2 \le n (x_1^2 + \cdots + x_n^2) \]

for all positive integers $n$ and all real numbers $x_1, \dots, x_n$.
13  Suppose $u, v$ are nonzero vectors in $\mathbf{R}^2$. Prove that

\[ \langle u, v \rangle = \|u\| \, \|v\| \cos\theta, \]

where $\theta$ is the angle between $u$ and $v$ (thinking of $u$ and $v$ as arrows with
initial point at the origin).

Hint: Draw the triangle formed by $u$, $v$, and $u - v$; then use the law of
cosines.

14  The angle between two vectors (thought of as arrows with initial point at
the origin) in $\mathbf{R}^2$ or $\mathbf{R}^3$ can be defined geometrically. However, geometry
is not as clear in $\mathbf{R}^n$ for $n > 3$. Thus the angle between two nonzero
vectors $x, y \in \mathbf{R}^n$ is defined to be

\[ \arccos \frac{\langle x, y \rangle}{\|x\| \, \|y\|}, \]

where the motivation for this definition comes from the previous exercise.
Explain why the Cauchy-Schwarz Inequality is needed to show that this
definition makes sense.


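15  Prove that

\[ \left( \sum_{j=1}^{n} a_j b_j \right)^2 \le \left( \sum_{j=1}^{n} j a_j^2 \right) \left( \sum_{j=1}^{n} \frac{b_j^2}{j} \right) \]

for all real numbers $a_1, \dots, a_n$ and $b_1, \dots, b_n$.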
16  Suppose $u, v \in V$ are such that

\[ \|u\| = 3, \quad \|u + v\| = 4, \quad \|u - v\| = 6. \]

What number does $\|v\|$ equal?
17  Prove or disprove: there is an inner product on $\mathbf{R}^2$ such that the associated
norm is given by

\[ \|(x, y)\| = \max\{x, y\} \]

for all $(x, y) \in \mathbf{R}^2$.

18  Suppose $p > 0$. Prove that there is an inner product on $\mathbf{R}^2$ such that the
associated norm is given by

\[ \|(x, y)\| = (x^p + y^p)^{1/p} \]

for all $(x, y) \in \mathbf{R}^2$ if and only if $p = 2$.

19  Suppose $V$ is a real inner product space. Prove that

\[ \langle u, v \rangle = \frac{\|u + v\|^2 - \|u - v\|^2}{4} \]

for all $u, v \in V$.


20  Suppose $V$ is a complex inner product space. Prove that

\[ \langle u, v \rangle = \frac{\|u + v\|^2 - \|u - v\|^2 + \|u + iv\|^2 i - \|u - iv\|^2 i}{4} \]

for all $u, v \in V$.


21  A norm on a vector space $U$ is a function $\| \ \| : U \to [0, \infty)$ such
that $\|u\| = 0$ if and only if $u = 0$, $\|\alpha u\| = |\alpha| \, \|u\|$ for all $\alpha \in \mathbf{F}$
and all $u \in U$, and $\|u + v\| \le \|u\| + \|v\|$ for all $u, v \in U$. Prove
that a norm satisfying the parallelogram equality comes from an inner
product (in other words, show that if $\| \ \|$ is a norm on $U$ satisfying the
parallelogram equality, then there is an inner product $\langle \ , \ \rangle$ on $U$ such
that $\|u\| = \langle u, u \rangle^{1/2}$ for all $u \in U$).


22  Show that the square of an average is less than or equal to the average
of the squares. More precisely, show that if $a_1, \dots, a_n \in \mathbf{R}$, then the
square of the average of $a_1, \dots, a_n$ is less than or equal to the average
of $a_1^2, \dots, a_n^2$.


23  Suppose $V_1, \dots, V_m$ are inner product spaces. Show that the equation

\[ \langle (u_1, \dots, u_m), (v_1, \dots, v_m) \rangle = \langle u_1, v_1 \rangle + \cdots + \langle u_m, v_m \rangle \]

defines an inner product on $V_1 \times \cdots \times V_m$.

[In the expression above on the right, $\langle u_1, v_1 \rangle$ denotes the inner product
on $V_1$, ..., $\langle u_m, v_m \rangle$ denotes the inner product on $V_m$. Each of the spaces
$V_1, \dots, V_m$ may have a different inner product, even though the same
notation is used here.]
24  Suppose $S \in \mathcal{L}(V)$ is an injective operator on $V$. Define $\langle \cdot, \cdot \rangle_1$ by

\[ \langle u, v \rangle_1 = \langle Su, Sv \rangle \]

for $u, v \in V$. Show that $\langle \cdot, \cdot \rangle_1$ is an inner product on $V$.

25  Suppose $S \in \mathcal{L}(V)$ is not injective. Define $\langle \cdot, \cdot \rangle_1$ as in the exercise above.
Explain why $\langle \cdot, \cdot \rangle_1$ is not an inner product on $V$.

26  Suppose $f, g$ are differentiable functions from $\mathbf{R}$ to $\mathbf{R}^n$.

(a) Show that

\[ \langle f(t), g(t) \rangle' = \langle f'(t), g(t) \rangle + \langle f(t), g'(t) \rangle. \]

(b) Suppose $c > 0$ and $\|f(t)\| = c$ for every $t \in \mathbf{R}$. Show that
$\langle f'(t), f(t) \rangle = 0$ for every $t \in \mathbf{R}$.

(c) Interpret the result in part (b) geometrically in terms of the tangent
vector to a curve lying on a sphere in $\mathbf{R}^n$ centered at the origin.


[For the exercise above, a function $f : \mathbf{R} \to \mathbf{R}^n$ is called differentiable
if there exist differentiable functions $f_1, \dots, f_n$ from $\mathbf{R}$ to $\mathbf{R}$ such that
$f(t) = (f_1(t), \dots, f_n(t))$ for each $t \in \mathbf{R}$. Furthermore, for each $t \in \mathbf{R}$,
the derivative $f'(t) \in \mathbf{R}^n$ is defined by $f'(t) = (f_1'(t), \dots, f_n'(t))$.]


27  Suppose $u, v, w \in V$. Prove that

\[ \left\| w - \tfrac{1}{2}(u + v) \right\|^2 = \frac{\|w - u\|^2 + \|w - v\|^2}{2} - \frac{\|u - v\|^2}{4}. \]

28  Suppose $C$ is a subset of $V$ with the property that $u, v \in C$ implies
$\tfrac{1}{2}(u + v) \in C$. Let $w \in V$. Show that there is at most one point in $C$
that is closest to $w$. In other words, show that there is at most one $u \in C$
such that

\[ \|w - u\| \le \|w - v\| \quad \text{for all } v \in C. \]

Hint: Use the previous exercise.


29  For $u, v \in V$, define $d(u, v) = \|u - v\|$.

(a) Show that $d$ is a metric on $V$.

(b) Show that if $V$ is finite-dimensional, then $d$ is a complete metric
on $V$ (meaning that every Cauchy sequence converges).

(c) Show that every finite-dimensional subspace of $V$ is a closed
subset of $V$ (with respect to the metric $d$).
30  Fix a positive integer $n$. The Laplacian $\Delta p$ of a twice differentiable
function $p$ on $\mathbf{R}^n$ is the function on $\mathbf{R}^n$ defined by

\[ \Delta p = \frac{\partial^2 p}{\partial x_1^2} + \cdots + \frac{\partial^2 p}{\partial x_n^2}. \]

The function $p$ is called harmonic if $\Delta p = 0$.

A polynomial on $\mathbf{R}^n$ is a linear combination of functions of the
form $x_1^{m_1} \cdots x_n^{m_n}$, where $m_1, \dots, m_n$ are nonnegative integers.

Suppose $q$ is a polynomial on $\mathbf{R}^n$. Prove that there exists a harmonic
polynomial $p$ on $\mathbf{R}^n$ such that $p(x) = q(x)$ for every $x \in \mathbf{R}^n$ with
$\|x\| = 1$.

[The only fact about harmonic functions that you need for this exercise
is that if $p$ is a harmonic function on $\mathbf{R}^n$ and $p(x) = 0$ for all $x \in \mathbf{R}^n$
with $\|x\| = 1$, then $p = 0$.]

Hint: A reasonable guess is that the desired harmonic polynomial $p$ is of
the form $q + (1 - \|x\|^2) r$ for some polynomial $r$. Prove that there is a
polynomial $r$ on $\mathbf{R}^n$ such that $q + (1 - \|x\|^2) r$ is harmonic by defining
an operator $T$ on a suitable vector space by

\[ Tr = \Delta\bigl( (1 - \|x\|^2) r \bigr) \]

and then showing that $T$ is injective and hence surjective.


31  Use inner products to prove Apollonius's Identity: In a triangle with
sides of length $a$, $b$, and $c$, let $d$ be the length of the line segment from
the midpoint of the side of length $c$ to the opposite vertex. Then

\[ a^2 + b^2 = \tfrac{1}{2} c^2 + 2 d^2. \]
6.B  Orthonormal Bases


6.23  Definition  orthonormal

•  A list of vectors is called orthonormal if each vector in the list has
norm 1 and is orthogonal to all the other vectors in the list.

•  In other words, a list $e_1, \dots, e_m$ of vectors in $V$ is orthonormal if

\[ \langle e_j, e_k \rangle = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases} \]

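6.24  Example  orthonormal lists

(a) The standard basis in $\mathbf{F}^n$ is an orthonormal list.

(b) $\left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0 \right)$ is an orthonormal list in $\mathbf{F}^3$.

(c) $\left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0 \right), \left( \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}} \right)$ is an orthonormal list
in $\mathbf{F}^3$.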

Orthonormal  lists  are  particularly  easy  to  work  with,  as  illustrated  by  the 
next  result. 


6.25  The norm of an orthonormal linear combination

If $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$, then

\[ \|a_1 e_1 + \cdots + a_m e_m\|^2 = |a_1|^2 + \cdots + |a_m|^2 \]

for all $a_1, \dots, a_m \in \mathbf{F}$.


Proof  Because each $e_j$ has norm 1, this follows easily from repeated
applications of the Pythagorean Theorem (6.13). ■

The  result  above  has  the  following  important  corollary. 


6.26  An  orthonormal  list  is  linearly  independent 
Every  orthonormal  list  of  vectors  is  linearly  independent. 
Proof  Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$ and
$a_1, \dots, a_m \in \mathbf{F}$ are such that

\[ a_1 e_1 + \cdots + a_m e_m = 0. \]

Then $|a_1|^2 + \cdots + |a_m|^2 = 0$ (by 6.25), which means that all the $a_j$'s are 0.
Thus $e_1, \dots, e_m$ is linearly independent. ■


6.27  Definition  orthonormal  basis 

An  orthonormal  basis  of  V  is  an  orthonormal  list  of  vectors  in  V  that  is 
also  a  basis  of  V. 

For example, the standard basis is an orthonormal basis of $\mathbf{F}^n$.

6.28  An  orthonormal  list  of  the  right  length  is  an  orthonormal  basis 

Every  orthonormal  list  of  vectors  in  V  with  length  dim  V  is  an  orthonormal 
basis  of  V. 

Proof  By  6.26,  any  such  list  must  be  linearly  independent;  because  it  has  the 
right  length,  it  is  a  basis — see  2.39.  ■ 


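6.29  Example  Show that

\[ \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right), \left( \tfrac12, \tfrac12, -\tfrac12, -\tfrac12 \right), \left( \tfrac12, -\tfrac12, -\tfrac12, \tfrac12 \right), \left( -\tfrac12, \tfrac12, -\tfrac12, \tfrac12 \right) \]

is an orthonormal basis of $\mathbf{F}^4$.


Solution  We have

\[ \left\| \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right) \right\| = \sqrt{\left(\tfrac12\right)^2 + \left(\tfrac12\right)^2 + \left(\tfrac12\right)^2 + \left(\tfrac12\right)^2} = 1. \]

Similarly, the other three vectors in the list above also have norm 1.
We have

\[ \left\langle \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right), \left( \tfrac12, \tfrac12, -\tfrac12, -\tfrac12 \right) \right\rangle = \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot \left(-\tfrac12\right) + \tfrac12 \cdot \left(-\tfrac12\right) = 0. \]

Similarly, the inner product of any two distinct vectors in the list above also
equals 0.

Thus the list above is orthonormal. Because we have an orthonormal list of
length four in the four-dimensional vector space $\mathbf{F}^4$, this list is an orthonormal
basis of $\mathbf{F}^4$ (by 6.28).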
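The "similarly" steps in the solution above amount to four norm computations and six inner products, which can be checked mechanically. The sketch below does so with exact rational arithmetic; the variable names are assumptions for this illustration, and the entries are real, so no conjugates are needed.

```python
from fractions import Fraction
from itertools import combinations

h = Fraction(1, 2)
vectors = [
    ( h,  h,  h,  h),
    ( h,  h, -h, -h),
    ( h, -h, -h,  h),
    (-h,  h, -h,  h),
]

def dot(x, y):
    """Euclidean inner product for vectors with real entries."""
    return sum(a * b for a, b in zip(x, y))

# squared norm of each vector, and inner product of each pair of distinct vectors
norms_squared = [dot(v, v) for v in vectors]
cross_terms = [dot(x, y) for x, y in combinations(vectors, 2)]

is_orthonormal = (all(n == 1 for n in norms_squared)
                  and all(c == 0 for c in cross_terms))
```

Checking squared norms instead of norms avoids square roots, in line with the remark after 6.10 that norms squared are usually easier to work with.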
In general, given a basis $e_1, \dots, e_n$ of $V$ and a vector $v \in V$, we know that
there is some choice of scalars $a_1, \dots, a_n \in \mathbf{F}$ such that

\[ v = a_1 e_1 + \cdots + a_n e_n. \]

Computing the numbers $a_1, \dots, a_n$ that
satisfy the equation above can be difficult
for an arbitrary basis of $V$. The
next result shows, however, that this is
easy for an orthonormal basis: just take
$a_j = \langle v, e_j \rangle$.


The importance of orthonormal
bases stems mainly from the next
result.


6.30  Writing a vector as linear combination of orthonormal basis

Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $v \in V$. Then

\[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n \]

and

\[ \|v\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_n \rangle|^2. \]

Proof  Because $e_1, \dots, e_n$ is a basis of $V$, there exist scalars $a_1, \dots, a_n$ such
that

\[ v = a_1 e_1 + \cdots + a_n e_n. \]

Because $e_1, \dots, e_n$ is orthonormal, taking the inner product of both sides of
this equation with $e_j$ gives $\langle v, e_j \rangle = a_j$. Thus the first equation in 6.30 holds.

The second equation in 6.30 follows immediately from the first equation
and 6.25. ■
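To see 6.30 at work on a small case, the following sketch expands a vector of $\mathbf{R}^2$ in an orthonormal basis with rational entries. The particular basis (3/5, 4/5), (-4/5, 3/5), the vector (2, 1), and the helper name `dot` are assumptions picked so that every number stays exact.

```python
from fractions import Fraction as Fr

def dot(x, y):
    """Euclidean inner product for vectors with real entries."""
    return sum(a * b for a, b in zip(x, y))

# an orthonormal basis of R^2 with rational entries (illustrative choice)
e1 = (Fr(3, 5), Fr(4, 5))
e2 = (Fr(-4, 5), Fr(3, 5))
basis = (e1, e2)

v = (Fr(2), Fr(1))

# coefficients a_j = <v, e_j>, as in the first equation of 6.30
coeffs = [dot(v, e) for e in basis]

# rebuild v from its coefficients: a_1 e_1 + a_2 e_2
rebuilt = tuple(sum(a * e[i] for a, e in zip(coeffs, basis)) for i in range(2))
```

No linear system had to be solved to find `coeffs`; each coefficient is a single inner product, which is the practical point of the result.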

Now that we understand the usefulness of orthonormal bases, how do we
go about finding them? For example, does $\mathcal{P}_m(\mathbf{R})$, with inner product given
by integration on $[-1, 1]$ [see 6.4(c)], have an orthonormal basis? The next
result will lead to answers to these questions.

The algorithm used in the next proof
is called the Gram-Schmidt Procedure.
It gives a method for turning a linearly
independent list into an orthonormal list
with the same span as the original list.


Danish mathematician Jørgen
Gram (1850-1916) and German
mathematician Erhard Schmidt
(1876-1959) popularized this algorithm
that constructs orthonormal
lists.
6.31  Gram-Schmidt Procedure

Suppose $v_1, \dots, v_m$ is a linearly independent list of vectors in $V$. Let
$e_1 = v_1 / \|v_1\|$. For $j = 2, \dots, m$, define $e_j$ inductively by

\[ e_j = \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\|v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}\|}. \]

Then $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$ such that

\[ \operatorname{span}(v_1, \dots, v_j) = \operatorname{span}(e_1, \dots, e_j) \]

for $j = 1, \dots, m$.

Proof  We will show by induction on $j$ that the desired conclusion holds. To
get started with $j = 1$, note that $\operatorname{span}(v_1) = \operatorname{span}(e_1)$ because $v_1$ is a positive
multiple of $e_1$.

Suppose $1 < j \le m$ and we have verified that

6.32    \[ \operatorname{span}(v_1, \dots, v_{j-1}) = \operatorname{span}(e_1, \dots, e_{j-1}). \]

Note that $v_j \notin \operatorname{span}(v_1, \dots, v_{j-1})$ (because $v_1, \dots, v_m$ is linearly independent).
Thus $v_j \notin \operatorname{span}(e_1, \dots, e_{j-1})$. Hence we are not dividing by 0 in the
definition of $e_j$ given in 6.31. Dividing a vector by its norm produces a new
vector with norm 1; thus $\|e_j\| = 1$.

Let $1 \le k < j$. Then

\begin{align*}
\langle e_j, e_k \rangle &= \left\langle \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\|v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}\|}, e_k \right\rangle \\
&= \frac{\langle v_j, e_k \rangle - \langle v_j, e_k \rangle}{\|v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}\|} \\
&= 0.
\end{align*}

Thus $e_1, \dots, e_j$ is an orthonormal list.

From the definition of $e_j$ given in 6.31, we see that $v_j \in \operatorname{span}(e_1, \dots, e_j)$.
Combining this information with 6.32 shows that

\[ \operatorname{span}(v_1, \dots, v_j) \subset \operatorname{span}(e_1, \dots, e_j). \]

Both lists above are linearly independent (the $v$'s by hypothesis, the $e$'s by
orthonormality and 6.26). Thus both subspaces above have dimension $j$, and
hence they are equal, completing the proof. ■
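The formula in 6.31 translates almost line for line into code. The following Python sketch handles the special case of $\mathbf{R}^n$ with the Euclidean inner product and floating-point arithmetic. The function name `gram_schmidt` and the tolerance `tol` used to detect a (numerically) dependent list are assumptions of this illustration; the procedure in the text works with exact arithmetic and a list known to be linearly independent.

```python
import math

def dot(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Turn a linearly independent list in R^n into an orthonormal list."""
    es = []
    for v in vectors:
        # numerator of 6.31: v_j minus its components along e_1, ..., e_{j-1}
        w = list(v)
        for e in es:
            c = dot(v, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm < tol:
            # the numerator vanishes exactly when v_j lies in the span
            # of the earlier vectors
            raise ValueError("list is not linearly independent")
        es.append([wi / norm for wi in w])
    return es

es = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
```

In floating point, the coefficients are often computed from the partially reduced vector `w` instead of the original `v` (the so-called modified Gram-Schmidt variant) for better numerical behavior; the version above keeps the formula of 6.31 as stated.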
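6.33  Example  Find an orthonormal basis of $\mathcal{P}_2(\mathbf{R})$, where the inner product
is given by $\langle p, q \rangle = \int_{-1}^{1} p(x) q(x) \, dx$.


Solution  We will apply the Gram-Schmidt Procedure (6.31) to the basis
$1, x, x^2$.

To get started, with this inner product we have

\[ \|1\|^2 = \int_{-1}^{1} 1^2 \, dx = 2. \]

Thus $\|1\| = \sqrt{2}$, and hence $e_1 = \sqrt{\tfrac12}$.

Now the numerator in the expression for $e_2$ is

\[ x - \langle x, e_1 \rangle e_1 = x - \left( \int_{-1}^{1} x \sqrt{\tfrac12} \, dx \right) \sqrt{\tfrac12} = x. \]

We have

\[ \|x\|^2 = \int_{-1}^{1} x^2 \, dx = \tfrac23. \]

Thus $\|x\| = \sqrt{\tfrac23}$, and hence $e_2 = \sqrt{\tfrac32} \, x$.

Now the numerator in the expression for $e_3$ is

\begin{align*}
x^2 - \langle x^2, e_1 \rangle e_1 - \langle x^2, e_2 \rangle e_2
&= x^2 - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac12} \, dx \right) \sqrt{\tfrac12} - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac32} \, x \, dx \right) \sqrt{\tfrac32} \, x \\
&= x^2 - \tfrac13.
\end{align*}

We have

\[ \left\| x^2 - \tfrac13 \right\|^2 = \int_{-1}^{1} \left( x^4 - \tfrac23 x^2 + \tfrac19 \right) dx = \tfrac{8}{45}. \]

Thus $\left\| x^2 - \tfrac13 \right\| = \sqrt{\tfrac{8}{45}}$, and hence $e_3 = \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac13 \right)$.

Thus

\[ \sqrt{\tfrac12}, \quad \sqrt{\tfrac32} \, x, \quad \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac13 \right) \]

is an orthonormal list of length 3 in $\mathcal{P}_2(\mathbf{R})$. Hence this orthonormal list is an
orthonormal basis of $\mathcal{P}_2(\mathbf{R})$ by 6.28.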
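The integrals in this example can be double-checked with exact arithmetic. The sketch below represents a polynomial by its list of coefficients and uses the fact that the integral of $x^k$ over $[-1, 1]$ is $2/(k+1)$ for even $k$ and 0 for odd $k$. It works with the unnormalized numerators $1$, $x$, $x^2 - \tfrac13$ so that no square roots are needed; the helper names `poly_mul` and `inner` are assumptions for this illustration.

```python
from fractions import Fraction as Fr

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (p[k] multiplies x^k)."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral of p(x) q(x) over [-1, 1], computed exactly."""
    # odd powers integrate to 0; x^k with k even integrates to 2/(k+1)
    return sum(c * Fr(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

# unnormalized Gram-Schmidt numerators: 1, x, x^2 - 1/3
q1 = [Fr(1)]
q2 = [Fr(0), Fr(1)]
q3 = [Fr(-1, 3), Fr(0), Fr(1)]

squared_norms = [inner(q, q) for q in (q1, q2, q3)]
cross = [inner(q1, q2), inner(q1, q3), inner(q2, q3)]
```

Dividing each numerator by the square root of the corresponding entry of `squared_norms` gives the normalized vectors; the entries of `cross` show whether the three numerators are pairwise orthogonal.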
Now we can answer the question about the existence of orthonormal bases.

6.34  Existence of orthonormal basis

Every finite-dimensional inner product space has an orthonormal basis.

Proof  Suppose $V$ is finite-dimensional. Choose a basis of $V$. Apply the
Gram-Schmidt Procedure (6.31) to it, producing an orthonormal list with
length $\dim V$. By 6.28, this orthonormal list is an orthonormal basis of $V$. ■

Sometimes  we  need  to  know  not  only  that  an  orthonormal  basis  exists,  but 
also  that  every  orthonormal  list  can  be  extended  to  an  orthonormal  basis.  In 
the  next  corollary,  the  Gram-Schmidt  Procedure  shows  that  such  an  extension 
is  always  possible. 

6.35  Orthonormal  list  extends  to  orthonormal  basis 

Suppose  V  is  finite-dimensional.  Then  every  orthonormal  list  of  vectors 
in  V  can  be  extended  to  an  orthonormal  basis  of  V. 

Proof  Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$. Then
$e_1, \dots, e_m$ is linearly independent (by 6.26). Hence this list can be extended to
a basis $e_1, \dots, e_m, v_1, \dots, v_n$ of $V$ (see 2.33). Now apply the Gram-Schmidt
Procedure (6.31) to $e_1, \dots, e_m, v_1, \dots, v_n$, producing an orthonormal list

6.36    \[ e_1, \dots, e_m, f_1, \dots, f_n; \]

here the formula given by the Gram-Schmidt Procedure leaves the first $m$
vectors unchanged because they are already orthonormal. The list above is an
orthonormal basis of $V$ by 6.28. ■

Recall that a matrix is called upper triangular if all entries below the
diagonal equal 0. In other words, an upper-triangular matrix looks like this:

\[ \begin{pmatrix} * & & * \\ & \ddots & \\ 0 & & * \end{pmatrix}, \]

where the 0 in the matrix above indicates that all entries below the diagonal
equal 0, and asterisks are used to denote entries on and above the diagonal.
In the last chapter we showed that if V is a finite-dimensional complex
vector  space,  then  for  each  operator  on  V  there  is  a  basis  with  respect  to 
which  the  matrix  of  the  operator  is  upper  triangular  (see  5.27).  Now  that  we 
are  dealing  with  inner  product  spaces,  we  would  like  to  know  whether  there 
exists  an  orthonormal  basis  with  respect  to  which  we  have  an  upper-triangular 
matrix. 

The  next  result  shows  that  the  existence  of  a  basis  with  respect  to  which 
T  has  an  upper-triangular  matrix  implies  the  existence  of  an  orthonormal 
basis  with  this  property.  This  result  is  true  on  both  real  and  complex  vector 
spaces  (although  on  a  real  vector  space,  the  hypothesis  holds  only  for  some 
operators). 

6.37  Upper-triangular  matrix  with  respect  to  orthonormal  basis 

Suppose $T \in \mathcal{L}(V)$. If $T$ has an upper-triangular matrix with respect to
some basis of $V$, then $T$ has an upper-triangular matrix with respect to
some orthonormal basis of $V$.

Proof  Suppose $T$ has an upper-triangular matrix with respect to some basis
$v_1, \dots, v_n$ of $V$. Thus $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$ for each
$j = 1, \dots, n$ (see 5.26).

Apply the Gram-Schmidt Procedure to $v_1, \dots, v_n$, producing an orthonormal
basis $e_1, \dots, e_n$ of $V$. Because

\[ \operatorname{span}(e_1, \dots, e_j) = \operatorname{span}(v_1, \dots, v_j) \]

for each $j$ (see 6.31), we conclude that $\operatorname{span}(e_1, \dots, e_j)$ is invariant under $T$
for each $j = 1, \dots, n$. Thus, by 5.26, $T$ has an upper-triangular matrix with
respect to the orthonormal basis $e_1, \dots, e_n$. ■
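The argument above can be checked numerically. The following is a minimal sketch, assuming NumPy; the basis `B`, the upper-triangular matrix `M`, and the helper `gram_schmidt` are hypothetical illustrations, not taken from the text. It builds an operator on $\mathbf{R}^3$ whose matrix is upper triangular in a non-orthogonal basis, applies Gram-Schmidt to that basis, and inspects the matrix of the same operator in the resulting orthonormal basis.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt on a list of linearly independent real vectors."""
    basis = []
    for v in vectors:
        # Subtract the components along the vectors already produced.
        w = v - sum(np.dot(v, e) * e for e in basis)
        basis.append(w / np.linalg.norm(w))
    return basis

# Hypothetical data: a basis v_1, v_2, v_3 of R^3 (the columns of B) and an
# upper-triangular matrix M, taken to be the matrix of T in that basis.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0]])
M = np.array([[2.0, 1.0, 3.0],
              [0.0, -1.0, 4.0],
              [0.0, 0.0, 5.0]])
A = B @ M @ np.linalg.inv(B)      # matrix of T in the standard basis

E = np.column_stack(gram_schmidt(list(B.T)))   # columns e_1, e_2, e_3
M_E = E.T @ A @ E                 # matrix of T in the orthonormal basis

print(np.round(M_E, 6))           # entries below the diagonal should vanish
```

The entries of `M_E` below the diagonal vanish up to rounding because each $\operatorname{span}(e_1, \dots, e_j)$ equals $\operatorname{span}(v_1, \dots, v_j)$, which is invariant under the operator.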

The next result is an important application of the result above.

German mathematician Issai Schur (1875-1941) published the first proof of the
next result in 1909.

6.38  Schur’s Theorem

Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$.
Then $T$ has an upper-triangular matrix with respect to some orthonormal
basis of $V$.

Proof  Recall that $T$ has an upper-triangular matrix with respect to some basis
of $V$ (see 5.27). Now apply 6.37. ■

Linear Functionals on Inner Product Spaces

Because  linear  maps  into  the  scalar  field  F  play  a  special  role,  we  defined  a 
special  name  for  them  in  Section  3.F.  That  definition  is  repeated  below  in 
case  you  skipped  Section  3.F. 

6.39  Definition  linear  functional 

A linear functional on $V$ is a linear map from $V$ to $\mathbf{F}$. In other words, a
linear functional is an element of $\mathcal{L}(V, \mathbf{F})$.


6.40  Example  The function $\varphi : \mathbf{F}^3 \to \mathbf{F}$ defined by

\[ \varphi(z_1, z_2, z_3) = 2z_1 - 5z_2 + z_3 \]

is a linear functional on $\mathbf{F}^3$. We could write this linear functional in the form

\[ \varphi(z) = \langle z, u \rangle \]

for every $z \in \mathbf{F}^3$, where $u = (2, -5, 1)$.


6.41  Example  The function $\varphi : \mathcal{P}_2(\mathbf{R}) \to \mathbf{R}$ defined by

\[ \varphi(p) = \int_{-1}^{1} p(t) \cos(\pi t) \, dt \]

is a linear functional on $\mathcal{P}_2(\mathbf{R})$ (here the inner product on $\mathcal{P}_2(\mathbf{R})$ is
multiplication followed by integration on $[-1, 1]$; see 6.33). It is not obvious that
there exists $u \in \mathcal{P}_2(\mathbf{R})$ such that

\[ \varphi(p) = \langle p, u \rangle \]

for every $p \in \mathcal{P}_2(\mathbf{R})$ [we cannot take $u(t) = \cos(\pi t)$ because that function
is not an element of $\mathcal{P}_2(\mathbf{R})$].


If $u \in V$, then the map that sends $v$ to $\langle v, u \rangle$ is a linear functional on $V$.
The next result shows that every linear functional on $V$ is of this form.
Example 6.41 above illustrates the power of the next result because for the linear
functional in that example, there is no obvious candidate for $u$.


The next result is named in honor of Hungarian mathematician Frigyes
Riesz (1880-1956), who proved several results early in the twentieth century
that look very much like the result below.

6.42  Riesz Representation Theorem

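Suppose $V$ is finite-dimensional and $\varphi$ is a linear functional on $V$. Then
there is a unique vector $u \in V$ such that

\[ \varphi(v) = \langle v, u \rangle \]

for every $v \in V$.

Proof  First we show there exists a vector $u \in V$ such that $\varphi(v) = \langle v, u \rangle$ for
every $v \in V$. Let $e_1, \dots, e_n$ be an orthonormal basis of $V$. Then

\begin{align*}
\varphi(v) &= \varphi(\langle v, e_1 \rangle e_1 + \dots + \langle v, e_n \rangle e_n) \\
&= \langle v, e_1 \rangle \varphi(e_1) + \dots + \langle v, e_n \rangle \varphi(e_n) \\
&= \langle v, \overline{\varphi(e_1)} e_1 + \dots + \overline{\varphi(e_n)} e_n \rangle
\end{align*}

for every $v \in V$, where the first equality comes from 6.30. Thus setting

\[ u = \overline{\varphi(e_1)} e_1 + \dots + \overline{\varphi(e_n)} e_n, \tag{6.43} \]

we have $\varphi(v) = \langle v, u \rangle$ for every $v \in V$, as desired.

Now we prove that only one vector $u \in V$ has the desired behavior.
Suppose $u_1, u_2 \in V$ are such that

\[ \varphi(v) = \langle v, u_1 \rangle = \langle v, u_2 \rangle \]

for every $v \in V$. Then

\[ 0 = \langle v, u_1 \rangle - \langle v, u_2 \rangle = \langle v, u_1 - u_2 \rangle \]

for every $v \in V$. Taking $v = u_1 - u_2$ shows that $u_1 - u_2 = 0$. In other words,
$u_1 = u_2$, completing the proof of the uniqueness part of the result. ■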


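6.44  Example  Find $u \in \mathcal{P}_2(\mathbf{R})$ such that

\[ \int_{-1}^{1} p(t) \cos(\pi t) \, dt = \int_{-1}^{1} p(t) u(t) \, dt \]

for every $p \in \mathcal{P}_2(\mathbf{R})$.

Solution  Let $\varphi(p) = \int_{-1}^{1} p(t) \cos(\pi t) \, dt$. Applying formula 6.43 from
the proof above, and using the orthonormal basis from Example 6.33, we have

\begin{align*}
u(x) ={}& \Bigl( \int_{-1}^{1} \sqrt{\tfrac{1}{2}} \cos(\pi t) \, dt \Bigr) \sqrt{\tfrac{1}{2}}
 + \Bigl( \int_{-1}^{1} \sqrt{\tfrac{3}{2}} \, t \cos(\pi t) \, dt \Bigr) \sqrt{\tfrac{3}{2}} \, x \\
&+ \Bigl( \int_{-1}^{1} \sqrt{\tfrac{45}{8}} \, (t^2 - \tfrac{1}{3}) \cos(\pi t) \, dt \Bigr) \sqrt{\tfrac{45}{8}} \, (x^2 - \tfrac{1}{3}).
\end{align*}

A bit of calculus shows that

\[ u(x) = -\frac{45}{2\pi^2} \Bigl( x^2 - \frac{1}{3} \Bigr). \]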
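The integrals in this example can also be evaluated numerically. The sketch below assumes NumPy and uses Gauss-Legendre quadrature; the three basis functions written out are the normalized Legendre polynomials, which is what the orthonormal basis of Example 6.33 is assumed to be here. All variable names are illustrative.

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1].
nodes, weights = np.polynomial.legendre.leggauss(40)

def integrate(f):
    """Approximate the integral of f over [-1, 1]."""
    return np.sum(weights * f(nodes))

# Assumed orthonormal basis of P_2(R) for <p, q> = integral of p*q over [-1, 1].
basis = [
    lambda t: np.sqrt(1 / 2) * np.ones_like(t),
    lambda t: np.sqrt(3 / 2) * t,
    lambda t: np.sqrt(45 / 8) * (t**2 - 1 / 3),
]

def phi(p):
    """The linear functional p -> integral of p(t) cos(pi t) over [-1, 1]."""
    return integrate(lambda t: p(t) * np.cos(np.pi * t))

# Formula 6.43 (real scalars, so no complex conjugation is needed).
coeffs = [phi(e) for e in basis]

def u(x):
    return sum(c * e(x) for c, e in zip(coeffs, basis))

x = np.linspace(-1.0, 1.0, 5)
closed_form = -45 / (2 * np.pi**2) * (x**2 - 1 / 3)
print(np.max(np.abs(u(x) - closed_form)))   # difference from the closed form
```

The first two coefficients vanish (up to rounding) because the integrands are, respectively, a cosine over full periods and an odd function; only the quadratic term survives.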


Suppose $V$ is finite-dimensional and $\varphi$ is a linear functional on $V$. Then 6.43
gives a formula for the vector $u$ that satisfies $\varphi(v) = \langle v, u \rangle$ for all $v \in V$.
Specifically, we have

\[ u = \overline{\varphi(e_1)} e_1 + \dots + \overline{\varphi(e_n)} e_n. \]

The right side of the equation above seems to depend on the orthonormal
basis $e_1, \dots, e_n$ as well as on $\varphi$. However, 6.42 tells us that $u$ is uniquely
determined by $\varphi$. Thus the right side of the equation above is the same
regardless of which orthonormal basis $e_1, \dots, e_n$ of $V$ is chosen.
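This basis independence is easy to observe numerically. The following is a small sketch, assuming NumPy and the standard inner product on $\mathbf{C}^3$ (linear in the first slot, conjugate-linear in the second); the functional `phi`, the coefficient vector `c`, and the helper names are hypothetical. It evaluates the formula for $u$ with two different orthonormal bases, conjugating each $\varphi(e_j)$ as the complex case requires.

```python
import numpy as np

def inner(v, w):
    """<v, w> on C^n: linear in v, conjugate-linear in w."""
    return np.sum(v * np.conj(w))

def riesz_vector(phi, onb):
    """Sum of conj(phi(e_j)) * e_j over the orthonormal basis given by the rows of onb."""
    return sum(np.conj(phi(e)) * e for e in onb)

# Hypothetical linear functional on C^3: phi(z) = c_1 z_1 + c_2 z_2 + c_3 z_3.
c = np.array([2.0, -5.0, 1.0 + 2.0j])
phi = lambda z: np.dot(c, z)

standard = np.eye(3, dtype=complex)

# A second orthonormal basis: the columns of a unitary matrix from a QR factorization.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
other = Q.T   # rows of Q.T are the columns of Q

u1 = riesz_vector(phi, standard)
u2 = riesz_vector(phi, other)

v = rng.normal(size=3) + 1j * rng.normal(size=3)
print(np.allclose(u1, u2), np.isclose(phi(v), inner(v, u1)))
```

Both bases produce the same vector, namely the entrywise conjugate of `c`, and that vector represents `phi` through the inner product.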

EXERCISES  6.B 


1  (a) Suppose $\theta \in \mathbf{R}$. Show that $(\cos\theta, \sin\theta), (-\sin\theta, \cos\theta)$ and
$(\cos\theta, \sin\theta), (\sin\theta, -\cos\theta)$ are orthonormal bases of $\mathbf{R}^2$.

(b) Show that each orthonormal basis of $\mathbf{R}^2$ is of the form given by
one of the two possibilities of part (a).


2  Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$. Let $v \in V$.
Prove that

\[ \|v\|^2 = |\langle v, e_1 \rangle|^2 + \dots + |\langle v, e_m \rangle|^2 \]

if and only if $v \in \operatorname{span}(e_1, \dots, e_m)$.


3  Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ has an upper-triangular matrix with respect to
the basis $(1, 0, 0), (1, 1, 1), (1, 1, 2)$. Find an orthonormal basis of $\mathbf{R}^3$
(use the usual inner product on $\mathbf{R}^3$) with respect to which $T$ has an
upper-triangular matrix.

4  Suppose $n$ is a positive integer. Prove that

\[ \frac{1}{\sqrt{2\pi}}, \frac{\cos x}{\sqrt{\pi}}, \frac{\cos 2x}{\sqrt{\pi}}, \dots, \frac{\cos nx}{\sqrt{\pi}}, \frac{\sin x}{\sqrt{\pi}}, \frac{\sin 2x}{\sqrt{\pi}}, \dots, \frac{\sin nx}{\sqrt{\pi}} \]

is an orthonormal list of vectors in $C[-\pi, \pi]$, the vector space of continuous
real-valued functions on $[-\pi, \pi]$ with inner product

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x) \, dx. \]

[The orthonormal list above is often used for modeling periodic phenomena
such as tides.]

5  On $\mathcal{P}_2(\mathbf{R})$, consider the inner product given by

\[ \langle p, q \rangle = \int_0^1 p(x) q(x) \, dx. \]

Apply the Gram-Schmidt Procedure to the basis $1, x, x^2$ to produce an
orthonormal basis of $\mathcal{P}_2(\mathbf{R})$.

6  Find an orthonormal basis of $\mathcal{P}_2(\mathbf{R})$ (with inner product as in Exercise 5)
such that the differentiation operator (the operator that takes $p$ to $p'$)
on $\mathcal{P}_2(\mathbf{R})$ has an upper-triangular matrix with respect to this basis.

7  Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that

\[ p\bigl(\tfrac{1}{2}\bigr) = \int_0^1 p(x) q(x) \, dx \]

for every $p \in \mathcal{P}_2(\mathbf{R})$.

8  Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that

\[ \int_0^1 p(x) \cos(\pi x) \, dx = \int_0^1 p(x) q(x) \, dx \]

for every $p \in \mathcal{P}_2(\mathbf{R})$.

9  What happens if the Gram-Schmidt Procedure is applied to a list of
vectors that is not linearly independent?

10  Suppose $V$ is a real inner product space and $v_1, \dots, v_m$ is a linearly
independent list of vectors in $V$. Prove that there exist exactly $2^m$ orthonormal
lists $e_1, \dots, e_m$ of vectors in $V$ such that

\[ \operatorname{span}(v_1, \dots, v_j) = \operatorname{span}(e_1, \dots, e_j) \]

for all $j \in \{1, \dots, m\}$.

11  Suppose $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$ are inner products on $V$ such that $\langle v, w \rangle_1 = 0$
if and only if $\langle v, w \rangle_2 = 0$. Prove that there is a positive number $c$ such
that $\langle v, w \rangle_1 = c \langle v, w \rangle_2$ for every $v, w \in V$.

12  Suppose $V$ is finite-dimensional and $\langle \cdot, \cdot \rangle_1$, $\langle \cdot, \cdot \rangle_2$ are inner products on
$V$ with corresponding norms $\| \cdot \|_1$ and $\| \cdot \|_2$. Prove that there exists a
positive number $c$ such that

\[ \|v\|_1 \le c \|v\|_2 \]

for every $v \in V$.

13  Suppose $v_1, \dots, v_m$ is a linearly independent list in $V$. Show that there
exists $w \in V$ such that $\langle w, v_j \rangle > 0$ for all $j \in \{1, \dots, m\}$.

14  Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $v_1, \dots, v_n$ are
vectors in $V$ such that

\[ \|e_j - v_j\| < \frac{1}{\sqrt{n}} \]

for each $j$. Prove that $v_1, \dots, v_n$ is a basis of $V$.

15  Suppose $C_{\mathbf{R}}([-1, 1])$ is the vector space of continuous real-valued
functions on the interval $[-1, 1]$ with inner product given by

\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x) \, dx \]

for $f, g \in C_{\mathbf{R}}([-1, 1])$. Let $\varphi$ be the linear functional on $C_{\mathbf{R}}([-1, 1])$
defined by $\varphi(f) = f(0)$. Show that there does not exist $g \in C_{\mathbf{R}}([-1, 1])$
such that

\[ \varphi(f) = \langle f, g \rangle \]

for every $f \in C_{\mathbf{R}}([-1, 1])$.

[The exercise above shows that the Riesz Representation Theorem (6.42)
does not hold on infinite-dimensional vector spaces without additional
hypotheses on $V$ and $\varphi$.]

16  Suppose $\mathbf{F} = \mathbf{C}$, $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, all the eigenvalues
of $T$ have absolute value less than 1, and $\epsilon > 0$. Prove that there exists a
positive integer $m$ such that $\|T^m v\| \le \epsilon \|v\|$ for every $v \in V$.

17  For $u \in V$, let $\Phi u$ denote the linear functional on $V$ defined by

\[ (\Phi u)(v) = \langle v, u \rangle \]

for $v \in V$.

(a) Show that if $\mathbf{F} = \mathbf{R}$, then $\Phi$ is a linear map from $V$ to $V'$. (Recall
from Section 3.F that $V' = \mathcal{L}(V, \mathbf{F})$ and that $V'$ is called the dual
space of $V$.)

(b) Show that if $\mathbf{F} = \mathbf{C}$ and $V \ne \{0\}$, then $\Phi$ is not a linear map.

(c) Show that $\Phi$ is injective.

(d) Suppose $\mathbf{F} = \mathbf{R}$ and $V$ is finite-dimensional. Use parts (a) and (c)
and a dimension-counting argument (but without using 6.42) to
show that $\Phi$ is an isomorphism from $V$ onto $V'$.

[Part (d) gives an alternative proof of the Riesz Representation Theorem
(6.42) when $\mathbf{F} = \mathbf{R}$. Part (d) also gives a natural isomorphism (meaning
that it does not depend on a choice of basis) from a finite-dimensional
real inner product space onto its dual space.]

6.C

Orthogonal Complements and Minimization Problems

Orthogonal  Complements 

6.45  Definition  orthogonal complement, $U^\perp$

If $U$ is a subset of $V$, then the orthogonal complement of $U$, denoted $U^\perp$,
is the set of all vectors in $V$ that are orthogonal to every vector in $U$:

\[ U^\perp = \{ v \in V : \langle v, u \rangle = 0 \text{ for every } u \in U \}. \]

For example, if $U$ is a line in $\mathbf{R}^3$, then $U^\perp$ is the plane containing the
origin that is perpendicular to $U$. If $U$ is a plane in $\mathbf{R}^3$, then $U^\perp$ is the line
containing the origin that is perpendicular to $U$.

6.46  Basic properties of orthogonal complement

(a) If $U$ is a subset of $V$, then $U^\perp$ is a subspace of $V$.

(b) $\{0\}^\perp = V$.

(c) $V^\perp = \{0\}$.

(d) If $U$ is a subset of $V$, then $U \cap U^\perp \subset \{0\}$.

(e) If $U$ and $W$ are subsets of $V$ and $U \subset W$, then $W^\perp \subset U^\perp$.

Proof 

(a) Suppose $U$ is a subset of $V$. Then $\langle 0, u \rangle = 0$ for every $u \in U$; thus
$0 \in U^\perp$.

Suppose $v, w \in U^\perp$. If $u \in U$, then

\[ \langle v + w, u \rangle = \langle v, u \rangle + \langle w, u \rangle = 0 + 0 = 0. \]

Thus $v + w \in U^\perp$. In other words, $U^\perp$ is closed under addition.
Similarly, suppose $\lambda \in \mathbf{F}$ and $v \in U^\perp$. If $u \in U$, then

\[ \langle \lambda v, u \rangle = \lambda \langle v, u \rangle = \lambda \cdot 0 = 0. \]

Thus $\lambda v \in U^\perp$. In other words, $U^\perp$ is closed under scalar multiplication.
Thus $U^\perp$ is a subspace of $V$.

(b) Suppose $v \in V$. Then $\langle v, 0 \rangle = 0$, which implies that $v \in \{0\}^\perp$. Thus
$\{0\}^\perp = V$.

(c) Suppose $v \in V^\perp$. Then $\langle v, v \rangle = 0$, which implies that $v = 0$. Thus
$V^\perp = \{0\}$.

(d) Suppose $U$ is a subset of $V$ and $v \in U \cap U^\perp$. Then $\langle v, v \rangle = 0$, which
implies that $v = 0$. Thus $U \cap U^\perp \subset \{0\}$.

(e) Suppose $U$ and $W$ are subsets of $V$ and $U \subset W$. Suppose $v \in W^\perp$.
Then $\langle v, u \rangle = 0$ for every $u \in W$, which implies that $\langle v, u \rangle = 0$ for
every $u \in U$. Hence $v \in U^\perp$. Thus $W^\perp \subset U^\perp$. ■

Recall that if $U, W$ are subspaces of $V$, then $V$ is the direct sum of $U$ and
$W$ (written $V = U \oplus W$) if each element of $V$ can be written in exactly one
way as a vector in $U$ plus a vector in $W$ (see 1.40).

The next result shows that every finite-dimensional subspace of $V$ leads to
a natural direct sum decomposition of $V$.

6.47  Direct sum of a subspace and its orthogonal complement

Suppose $U$ is a finite-dimensional subspace of $V$. Then

\[ V = U \oplus U^\perp. \]

Proof  First we will show that

\[ V = U + U^\perp. \tag{6.48} \]

To do this, suppose $v \in V$. Let $e_1, \dots, e_m$ be an orthonormal basis of $U$.
Obviously

\[ v = \underbrace{\langle v, e_1 \rangle e_1 + \dots + \langle v, e_m \rangle e_m}_{u} + \underbrace{v - \langle v, e_1 \rangle e_1 - \dots - \langle v, e_m \rangle e_m}_{w}. \tag{6.49} \]

Let $u$ and $w$ be defined as in the equation above. Clearly $u \in U$. Because
$e_1, \dots, e_m$ is an orthonormal list, for each $j = 1, \dots, m$ we have

\begin{align*}
\langle w, e_j \rangle &= \langle v, e_j \rangle - \langle v, e_j \rangle \\
&= 0.
\end{align*}

Thus $w$ is orthogonal to every vector in $\operatorname{span}(e_1, \dots, e_m)$. In other words,
$w \in U^\perp$. Thus we have written $v = u + w$, where $u \in U$ and $w \in U^\perp$,
completing the proof of 6.48.

From 6.46(d), we know that $U \cap U^\perp = \{0\}$. Along with 6.48, this implies
that $V = U \oplus U^\perp$ (see 1.45). ■




Now we can see how to compute $\dim U^\perp$ from $\dim U$.

6.50  Dimension of the orthogonal complement

Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Then

\[ \dim U^\perp = \dim V - \dim U. \]

Proof  The formula for $\dim U^\perp$ follows immediately from 6.47 and 3.78. ■
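
The dimension formula can be illustrated in $\mathbf{R}^n$ with the usual inner product. The sketch below assumes NumPy; the helper `orthogonal_complement` and the spanning vectors are hypothetical choices. It obtains an orthonormal basis of $U^\perp$ from the singular value decomposition, which is a standard numerical technique rather than the route taken in the proof above.

```python
import numpy as np

def orthogonal_complement(spanning_rows, tol=1e-12):
    """Orthonormal basis (as rows) of the orthogonal complement in R^n
    of the span of the given rows."""
    A = np.atleast_2d(np.asarray(spanning_rows, dtype=float))
    # With the default full_matrices=True, vt is n-by-n with orthonormal rows;
    # the rows past the rank are orthogonal to every row of A.
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:]

# Hypothetical subspace U of R^4 spanned by two independent vectors.
U_rows = np.array([[1.0, 2.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0, 0.0]])

W = orthogonal_complement(U_rows)      # basis of the complement of U
back = orthogonal_complement(W)        # basis of the complement of the complement

dim_U = np.linalg.matrix_rank(U_rows)
print(dim_U, W.shape[0])               # the two dimensions add up to 4
```

Taking the complement twice returns a basis of the original subspace: stacking `back` on top of `U_rows` does not increase the rank.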

The next result is an important consequence of 6.47.

6.51  The orthogonal complement of the orthogonal complement

Suppose $U$ is a finite-dimensional subspace of $V$. Then

\[ U = (U^\perp)^\perp. \]

Proof  First we will show that

\[ U \subset (U^\perp)^\perp. \tag{6.52} \]

To do this, suppose $u \in U$. Then $\langle u, v \rangle = 0$ for every $v \in U^\perp$ (by the
definition of $U^\perp$). Because $u$ is orthogonal to every vector in $U^\perp$, we have
$u \in (U^\perp)^\perp$, completing the proof of 6.52.

To prove the inclusion in the other direction, suppose $v \in (U^\perp)^\perp$. By
6.47, we can write $v = u + w$, where $u \in U$ and $w \in U^\perp$. We have
$v - u = w \in U^\perp$. Because $v \in (U^\perp)^\perp$ and $u \in (U^\perp)^\perp$ (from 6.52), we
have $v - u \in (U^\perp)^\perp$. Thus $v - u \in U^\perp \cap (U^\perp)^\perp$, which implies that $v - u$
is orthogonal to itself, which implies that $v - u = 0$, which implies that
$v = u$, which implies that $v \in U$. Thus $(U^\perp)^\perp \subset U$, which along with 6.52
completes the proof. ■

We now define an operator $P_U$ for each finite-dimensional subspace of $V$.

6.53  Definition  orthogonal projection, $P_U$

Suppose $U$ is a finite-dimensional subspace of $V$. The orthogonal
projection of $V$ onto $U$ is the operator $P_U \in \mathcal{L}(V)$ defined as follows:
For $v \in V$, write $v = u + w$, where $u \in U$ and $w \in U^\perp$. Then $P_U v = u$.

The direct sum decomposition $V = U \oplus U^\perp$ given by 6.47 shows that
each $v \in V$ can be uniquely written in the form $v = u + w$ with $u \in U$ and
$w \in U^\perp$. Thus $P_U v$ is well defined.


6.54  Example  Suppose $x \in V$ with $x \ne 0$ and $U = \operatorname{span}(x)$. Show that

\[ P_U v = \frac{\langle v, x \rangle}{\|x\|^2} x \]

for every $v \in V$.

Solution  Suppose $v \in V$. Then

\[ v = \frac{\langle v, x \rangle}{\|x\|^2} x + \Bigl( v - \frac{\langle v, x \rangle}{\|x\|^2} x \Bigr), \]

where the first term on the right is in $\operatorname{span}(x)$ (and thus in $U$) and the second
term on the right is orthogonal to $x$ (and thus is in $U^\perp$). Thus $P_U v$ equals the
first term on the right, as desired.


6.55  Properties of the orthogonal projection $P_U$

Suppose $U$ is a finite-dimensional subspace of $V$ and $v \in V$. Then

(a) $P_U \in \mathcal{L}(V)$;

(b) $P_U u = u$ for every $u \in U$;

(c) $P_U w = 0$ for every $w \in U^\perp$;

(d) $\operatorname{range} P_U = U$;

(e) $\operatorname{null} P_U = U^\perp$;

(f) $v - P_U v \in U^\perp$;

(g) $P_U^2 = P_U$;

(h) $\|P_U v\| \le \|v\|$;

(i) for every orthonormal basis $e_1, \dots, e_m$ of $U$,

\[ P_U v = \langle v, e_1 \rangle e_1 + \dots + \langle v, e_m \rangle e_m. \]

Proof


(a) To show that $P_U$ is a linear map on $V$, suppose $v_1, v_2 \in V$. Write

\[ v_1 = u_1 + w_1 \quad \text{and} \quad v_2 = u_2 + w_2 \]

with $u_1, u_2 \in U$ and $w_1, w_2 \in U^\perp$. Thus $P_U v_1 = u_1$ and $P_U v_2 = u_2$.
Now

\[ v_1 + v_2 = (u_1 + u_2) + (w_1 + w_2), \]

where $u_1 + u_2 \in U$ and $w_1 + w_2 \in U^\perp$. Thus

\[ P_U(v_1 + v_2) = u_1 + u_2 = P_U v_1 + P_U v_2. \]

Similarly, suppose $\lambda \in \mathbf{F}$. The equation $v = u + w$ with $u \in U$ and
$w \in U^\perp$ implies that $\lambda v = \lambda u + \lambda w$ with $\lambda u \in U$ and $\lambda w \in U^\perp$.
Thus $P_U(\lambda v) = \lambda u = \lambda P_U v$.

Hence $P_U$ is a linear map from $V$ to $V$.

(b) Suppose $u \in U$. We can write $u = u + 0$, where $u \in U$ and $0 \in U^\perp$.
Thus $P_U u = u$.

(c) Suppose $w \in U^\perp$. We can write $w = 0 + w$, where $0 \in U$ and $w \in U^\perp$.
Thus $P_U w = 0$.

(d) The definition of $P_U$ implies that $\operatorname{range} P_U \subset U$. Part (b) implies that
$U \subset \operatorname{range} P_U$. Thus $\operatorname{range} P_U = U$.

(e) Part (c) implies that $U^\perp \subset \operatorname{null} P_U$. To prove the inclusion in the other
direction, note that if $v \in \operatorname{null} P_U$ then the decomposition given by 6.47
must be $v = 0 + v$, where $0 \in U$ and $v \in U^\perp$. Thus $\operatorname{null} P_U \subset U^\perp$.

(f) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then

\[ v - P_U v = v - u = w \in U^\perp. \]

(g) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then

\[ (P_U^2) v = P_U(P_U v) = P_U u = u = P_U v. \]

(h) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then

\[ \|P_U v\|^2 = \|u\|^2 \le \|u\|^2 + \|w\|^2 = \|v\|^2, \]

where the last equality comes from the Pythagorean Theorem.

(i) The formula for $P_U v$ follows from equation 6.49 in the proof of 6.47. ■
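
Several of the properties in 6.55 can be spot-checked in $\mathbf{R}^n$. The following is a minimal sketch, assuming NumPy and the usual inner product; the spanning matrix `A` and the helper `projection_matrix` are hypothetical. It uses the formula in 6.55(i), with the orthonormal basis of $U$ taken from a QR factorization instead of a hand-run Gram-Schmidt Procedure.

```python
import numpy as np

def projection_matrix(spanning_columns):
    """Matrix of P_U on R^n, where U is the span of the given
    (linearly independent) columns."""
    # Columns of Q form an orthonormal basis e_1, ..., e_m of U.
    Q, _ = np.linalg.qr(spanning_columns)
    # P_U v = <v, e_1> e_1 + ... + <v, e_m> e_m, i.e. Q (Q^T v).
    return Q @ Q.T

# Hypothetical two-dimensional subspace U of R^4.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [2.0, -1.0]])
P = projection_matrix(A)

rng = np.random.default_rng(1)
v = rng.normal(size=4)
u = A @ np.array([3.0, -2.0])          # some vector in U

print(np.allclose(P @ u, u))                          # (b)
print(np.allclose(A.T @ (v - P @ v), 0.0))            # (f)
print(np.allclose(P @ P, P))                          # (g)
print(np.linalg.norm(P @ v) <= np.linalg.norm(v))     # (h)
```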
Minimization Problems

The following problem often arises: given a subspace $U$ of $V$ and a point
$v \in V$, find a point $u \in U$ such that $\|v - u\|$ is as small as possible. The
next proposition shows that this minimization problem is solved by taking
$u = P_U v$.


The remarkable simplicity of the solution to this minimization problem
has led to many important applications of inner product spaces outside
of pure mathematics.


6.56  Minimizing the distance to a subspace

Suppose $U$ is a finite-dimensional subspace of $V$, $v \in V$, and $u \in U$. Then

\[ \|v - P_U v\| \le \|v - u\|. \]

Furthermore, the inequality above is an equality if and only if $u = P_U v$.


Proof  We have

\begin{align*}
\|v - P_U v\|^2 &\le \|v - P_U v\|^2 + \|P_U v - u\|^2 \tag{6.57} \\
&= \|(v - P_U v) + (P_U v - u)\|^2 \\
&= \|v - u\|^2,
\end{align*}

where the first line above holds because $0 \le \|P_U v - u\|^2$, the second
line above comes from the Pythagorean Theorem [which applies because
$v - P_U v \in U^\perp$ by 6.55(f), and $P_U v - u \in U$], and the third line above holds
by simple algebra. Taking square roots gives the desired inequality.

Our inequality above is an equality if and only if 6.57 is an equality,
which happens if and only if $\|P_U v - u\| = 0$, which happens if and only if
$u = P_U v$. ■
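
The inequality in 6.56 can be observed experimentally. The sketch below assumes NumPy and the usual inner product on $\mathbf{R}^5$; the random subspace and all names are hypothetical. It compares the distance from $v$ to $P_U v$ with the distance from $v$ to many other points of $U$, and also checks that NumPy's least-squares solver lands on the same point.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: U is the span of the two columns of A in R^5.
A = rng.normal(size=(5, 2))
v = rng.normal(size=5)

# Orthogonal projection of v onto U via an orthonormal basis of U.
Q, _ = np.linalg.qr(A)
Pv = Q @ (Q.T @ v)
best = np.linalg.norm(v - Pv)

# Distances from v to 1000 other points of U.
others = [A @ rng.normal(size=2) for _ in range(1000)]
distances = np.array([np.linalg.norm(v - u) for u in others])
print(best <= distances.min())

# Least squares minimizes ||v - A c|| over c, so A c should equal P_U v.
c, *_ = np.linalg.lstsq(A, v, rcond=None)
print(np.allclose(A @ c, Pv))
```

This is why least-squares fitting is often described as an orthogonal projection: the fitted vector is the closest point of the column span to the data.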


[Figure: $P_U v$ is the closest point in $U$ to $v$.]

The last result is often combined with the formula 6.55(i) to compute
explicit solutions to minimization problems.


6.58  Example  Find a polynomial $u$ with real coefficients and degree at
most 5 that approximates $\sin x$ as well as possible on the interval $[-\pi, \pi]$, in
the sense that

\[ \int_{-\pi}^{\pi} |\sin x - u(x)|^2 \, dx \]

is as small as possible. Compare this result to the Taylor series approximation.


Solution  Let $C_{\mathbf{R}}[-\pi, \pi]$ denote the real inner product space of continuous
real-valued functions on $[-\pi, \pi]$ with inner product

\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x) \, dx. \tag{6.59} \]

Let $v \in C_{\mathbf{R}}[-\pi, \pi]$ be the function defined by $v(x) = \sin x$. Let $U$ denote the
subspace of $C_{\mathbf{R}}[-\pi, \pi]$ consisting of the polynomials with real coefficients
and degree at most 5. Our problem can now be reformulated as follows:

Find $u \in U$ such that $\|v - u\|$ is as small as possible.

To compute the solution to our approximation problem, first apply the
Gram-Schmidt Procedure (using the inner product given by 6.59) to the basis
$1, x, x^2, x^3, x^4, x^5$ of $U$, producing an orthonormal basis
$e_1, e_2, e_3, e_4, e_5, e_6$ of $U$. Then, again using the inner
product given by 6.59, compute $P_U v$ using 6.55(i) (with $m = 6$). Doing this
computation shows that $P_U v$ is the function $u$ defined by

\[ u(x) = 0.987862x - 0.155271x^3 + 0.00564312x^5, \tag{6.60} \]

where the $\pi$'s that appear in the exact answer have been replaced with a good
decimal approximation.

A computer that can perform integrations is useful here.

By 6.56, the polynomial $u$ above is the best approximation to $\sin x$ on
$[-\pi, \pi]$ using polynomials of degree at most 5 (here “best approximation”
means in the sense of minimizing $\int_{-\pi}^{\pi} |\sin x - u(x)|^2 \, dx$). To see how good
this approximation is, the next figure shows the graphs of both $\sin x$ and our
approximation $u(x)$ given by 6.60 over the interval $[-\pi, \pi]$.

[Figure: Graphs on $[-\pi, \pi]$ of $\sin x$ (blue) and
its approximation $u(x)$ (red) given by 6.60.]


Our approximation 6.60 is so accurate that the two graphs are almost
identical; our eyes may see only one graph! Here the blue graph is placed
almost exactly over the red graph. If you are viewing this on an electronic
device, try enlarging the picture above, especially near 3 or -3, to see a small
gap between the two graphs.

Another well-known approximation to $\sin x$ by a polynomial of degree 5
is given by the Taylor polynomial

\[ x - \frac{x^3}{3!} + \frac{x^5}{5!}. \tag{6.61} \]

To see how good this approximation is, the next picture shows the graphs of
both $\sin x$ and the Taylor polynomial 6.61 over the interval $[-\pi, \pi]$.

[Figure: Graphs on $[-\pi, \pi]$ of $\sin x$ (blue) and the Taylor polynomial 6.61 (red).]

The Taylor polynomial is an excellent approximation to $\sin x$ for $x$ near 0.
But the picture above shows that for $|x| > 2$, the Taylor polynomial is not
so accurate, especially compared to 6.60. For example, taking $x = 3$, our
approximation 6.60 estimates $\sin 3$ with an error of about 0.001, but the Taylor
series 6.61 estimates $\sin 3$ with an error of about 0.4. Thus at $x = 3$, the error
in the Taylor series is hundreds of times larger than the error given by 6.60.
Linear algebra has helped us discover an approximation to $\sin x$ that improves
upon what we learned in calculus!
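
The coefficients in 6.60 can be reproduced with a short computation. This is a sketch under stated assumptions, using NumPy: instead of running the Gram-Schmidt Procedure, it solves the normal equations $Gc = b$, which express the condition that $v - u$ is orthogonal to each of $1, x, \dots, x^5$ and therefore yield the same projection $P_U v$. The Gram matrix is computed exactly and the right side by Gauss-Legendre quadrature; all variable names are illustrative.

```python
import numpy as np

deg = 5

# Gram matrix G[i, j] = <x^i, x^j> = integral of x^(i+j) over [-pi, pi].
G = np.array([[(np.pi**(i + j + 1) - (-np.pi)**(i + j + 1)) / (i + j + 1)
               for j in range(deg + 1)] for i in range(deg + 1)])

# Right side b[i] = <x^i, sin x>, by Gauss-Legendre quadrature on [-pi, pi].
t, wt = np.polynomial.legendre.leggauss(60)
xs, ws = np.pi * t, np.pi * wt
b = np.array([np.sum(ws * xs**i * np.sin(xs)) for i in range(deg + 1)])

# Coefficients of the best approximation u(x) = sum of coeffs[k] * x^k.
coeffs = np.linalg.solve(G, b)

def u(s):
    return sum(c * s**k for k, c in enumerate(coeffs))

def taylor(s):
    return s - s**3 / 6 + s**5 / 120

print(np.round(coeffs, 8))
print(abs(u(3.0) - np.sin(3.0)), abs(taylor(3.0) - np.sin(3.0)))
```

The even-degree coefficients come out as zero up to rounding because $\sin x$ is odd, and the two printed errors at $x = 3$ can be compared with the figures quoted above.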
EXERCISES 6.C


1  Suppose $v_1, \dots, v_m \in V$. Prove that

\[ \{v_1, \dots, v_m\}^\perp = \bigl( \operatorname{span}(v_1, \dots, v_m) \bigr)^\perp. \]

2  Suppose $U$ is a finite-dimensional subspace of $V$. Prove that $U^\perp = \{0\}$
if and only if $U = V$.

[Exercise 14(a) shows that the result above is not true without the
hypothesis that $U$ is finite-dimensional.]

3  Suppose $U$ is a subspace of $V$ with basis $u_1, \dots, u_m$ and

\[ u_1, \dots, u_m, w_1, \dots, w_n \]

is a basis of $V$. Prove that if the Gram-Schmidt Procedure is applied
to the basis of $V$ above, producing a list $e_1, \dots, e_m, f_1, \dots, f_n$, then
$e_1, \dots, e_m$ is an orthonormal basis of $U$ and $f_1, \dots, f_n$ is an orthonormal
basis of $U^\perp$.

4  Suppose $U$ is the subspace of $\mathbf{R}^4$ defined by

\[ U = \operatorname{span}\bigl( (1, 2, 3, -4), (-5, 4, 3, 2) \bigr). \]

Find an orthonormal basis of $U$ and an orthonormal basis of $U^\perp$.


5  Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Show that
$P_{U^\perp} = I - P_U$, where $I$ is the identity operator on $V$.


6  Suppose $U$ and $W$ are finite-dimensional subspaces of $V$. Prove that
$P_U P_W = 0$ if and only if $\langle u, w \rangle = 0$ for all $u \in U$ and all $w \in W$.

7  Suppose $V$ is finite-dimensional and $P \in \mathcal{L}(V)$ is such that $P^2 = P$ and
every vector in $\operatorname{null} P$ is orthogonal to every vector in $\operatorname{range} P$. Prove
that there exists a subspace $U$ of $V$ such that $P = P_U$.


8  Suppose $V$ is finite-dimensional and $P \in \mathcal{L}(V)$ is such that $P^2 = P$
and

\[ \|Pv\| \le \|v\| \]

for every $v \in V$. Prove that there exists a subspace $U$ of $V$ such that
$P = P_U$.


9  Suppose $T \in \mathcal{L}(V)$ and $U$ is a finite-dimensional subspace of $V$. Prove
that $U$ is invariant under $T$ if and only if $P_U T P_U = T P_U$.

10  Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $U$ is a subspace
of $V$. Prove that $U$ and $U^\perp$ are both invariant under $T$ if and only
if $P_U T = T P_U$.

11  In $\mathbf{R}^4$, let

\[ U = \operatorname{span}\bigl( (1, 1, 0, 0), (1, 1, 1, 2) \bigr). \]

Find $u \in U$ such that $\|u - (1, 2, 3, 4)\|$ is as small as possible.

12  Find $p \in \mathcal{P}_3(\mathbf{R})$ such that $p(0) = 0$, $p'(0) = 0$, and

\[ \int_0^1 |2 + 3x - p(x)|^2 \, dx \]

is as small as possible.


13  Find $p \in \mathcal{P}_5(\mathbf{R})$ that makes

\[ \int_{-\pi}^{\pi} |\sin x - p(x)|^2 \, dx \]

as small as possible.

[The polynomial 6.60 is an excellent approximation to the answer to this
exercise, but here you are asked to find the exact solution, which involves
powers of $\pi$. A computer that can perform symbolic integration will be
useful.]

14  Suppose $C_{\mathbf{R}}([-1, 1])$ is the vector space of continuous real-valued
functions on the interval $[-1, 1]$ with inner product given by

\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x) \, dx \]

for $f, g \in C_{\mathbf{R}}([-1, 1])$. Let $U$ be the subspace of $C_{\mathbf{R}}([-1, 1])$ defined
by

\[ U = \{ f \in C_{\mathbf{R}}([-1, 1]) : f(0) = 0 \}. \]

(a) Show that $U^\perp = \{0\}$.

(b) Show that 6.47 and 6.51 do not hold without the finite-dimensional
hypothesis.

CHAPTER 7  Operators on Inner Product Spaces


The  deepest  results  related  to  inner  product  spaces  deal  with  the  subject 
to  which  we  now  turn — operators  on  inner  product  spaces.  By  exploiting 
properties  of  the  adjoint,  we  will  develop  a  detailed  description  of  several 
important  classes  of  operators  on  inner  product  spaces. 

A  new  assumption  for  this  chapter  is  listed  in  the  second  bullet  point  below: 

7.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  and  W  denote  finite-dimensional  inner  product  spaces  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  adjoint 

■  Spectral  Theorem 

■  positive  operators 

■  isometries 

■  Polar  Decomposition 

■  Singular Value Decomposition

7.A

Self-Adjoint and Normal Operators


Adjoints 


7.2  Definition  adjoint, $T^*$

Suppose $T \in \mathcal{L}(V, W)$. The adjoint of $T$ is the function $T^* : W \to V$
such that

\[ \langle Tv, w \rangle = \langle v, T^* w \rangle \]

for every $v \in V$ and every $w \in W$.


To see why the definition above makes sense, suppose $T \in \mathcal{L}(V, W)$.
Fix $w \in W$. Consider the linear functional on $V$ that maps $v \in V$ to $\langle Tv, w \rangle$;
this linear functional depends on $T$ and $w$. By the Riesz Representation
Theorem (6.42), there exists a unique vector in $V$ such that this linear
functional is given by taking the inner product with it. We call this unique
vector $T^* w$. In other words, $T^* w$ is the unique vector in $V$ such that
$\langle Tv, w \rangle = \langle v, T^* w \rangle$ for every $v \in V$.

The word adjoint has another meaning in linear algebra. We do
not need the second meaning in this book. In case you encounter
the second meaning for adjoint elsewhere, be warned that the two
meanings for adjoint are unrelated to each other.

7.3  Example  Define $T : \mathbf{R}^3 \to \mathbf{R}^2$ by

\[ T(x_1, x_2, x_3) = (x_2 + 3x_3, 2x_1). \]

Find a formula for $T^*$.

Solution  Here $T^*$ will be a function from $\mathbf{R}^2$ to $\mathbf{R}^3$. To compute $T^*$, fix a
point $(y_1, y_2) \in \mathbf{R}^2$. Then for every $(x_1, x_2, x_3) \in \mathbf{R}^3$ we have

\begin{align*}
\langle (x_1, x_2, x_3), T^*(y_1, y_2) \rangle &= \langle T(x_1, x_2, x_3), (y_1, y_2) \rangle \\
&= \langle (x_2 + 3x_3, 2x_1), (y_1, y_2) \rangle \\
&= x_2 y_1 + 3x_3 y_1 + 2x_1 y_2 \\
&= \langle (x_1, x_2, x_3), (2y_2, y_1, 3y_1) \rangle.
\end{align*}

Thus

\[ T^*(y_1, y_2) = (2y_2, y_1, 3y_1). \]
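
The formula just derived can be sanity-checked numerically. This is a small sketch, assuming NumPy and the usual inner products on $\mathbf{R}^3$ and $\mathbf{R}^2$; the function name `T_star` is illustrative. It tests the defining identity $\langle Tx, y \rangle = \langle x, T^* y \rangle$ on random vectors and also compares the formula with the transpose of the matrix of $T$ in the standard bases.

```python
import numpy as np

# Matrix of T(x1, x2, x3) = (x2 + 3*x3, 2*x1) with respect to the standard bases.
T = np.array([[0.0, 1.0, 3.0],
              [2.0, 0.0, 0.0]])

def T_star(y):
    """The formula from Example 7.3: T*(y1, y2) = (2*y2, y1, 3*y1)."""
    return np.array([2.0 * y[1], y[0], 3.0 * y[0]])

rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = rng.normal(size=2)

lhs = np.dot(T @ x, y)        # <Tx, y> in R^2
rhs = np.dot(x, T_star(y))    # <x, T*y> in R^3
print(np.isclose(lhs, rhs), np.allclose(T.T @ y, T_star(y)))
```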
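7.4  Example  Fix $u \in V$ and $x \in W$. Define $T \in \mathcal{L}(V, W)$ by

\[ Tv = \langle v, u \rangle x \]

for every $v \in V$. Find a formula for $T^*$.

Solution  Fix $w \in W$. Then for every $v \in V$ we have

\begin{align*}
\langle v, T^* w \rangle &= \langle Tv, w \rangle \\
&= \langle \langle v, u \rangle x, w \rangle \\
&= \langle v, u \rangle \langle x, w \rangle \\
&= \langle v, \langle w, x \rangle u \rangle.
\end{align*}

Thus

\[ T^* w = \langle w, x \rangle u. \]

In the two examples above, $T^*$ turned out to be not just a function but a
linear map. This is true in general, as shown by the next result.

The proofs of the next two results use a common technique: flip $T^*$ from
one side of an inner product to become $T$ on the other side.

7.5  The adjoint is a linear map

If $T \in \mathcal{L}(V, W)$, then $T^* \in \mathcal{L}(W, V)$.

Proof  Suppose $T \in \mathcal{L}(V, W)$. Fix $w_1, w_2 \in W$. If $v \in V$, then

\begin{align*}
\langle v, T^*(w_1 + w_2) \rangle &= \langle Tv, w_1 + w_2 \rangle \\
&= \langle Tv, w_1 \rangle + \langle Tv, w_2 \rangle \\
&= \langle v, T^* w_1 \rangle + \langle v, T^* w_2 \rangle \\
&= \langle v, T^* w_1 + T^* w_2 \rangle,
\end{align*}

which shows that $T^*(w_1 + w_2) = T^* w_1 + T^* w_2$.

Fix $w \in W$ and $\lambda \in \mathbf{F}$. If $v \in V$, then

\begin{align*}
\langle v, T^*(\lambda w) \rangle &= \langle Tv, \lambda w \rangle \\
&= \bar{\lambda} \langle Tv, w \rangle \\
&= \bar{\lambda} \langle v, T^* w \rangle \\
&= \langle v, \lambda T^* w \rangle,
\end{align*}

which shows that $T^*(\lambda w) = \lambda T^* w$.

Thus $T^*$ is a linear map, as desired. ■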




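7.6  Properties of the adjoint

(a) $(S + T)^* = S^* + T^*$ for all $S, T \in \mathcal{L}(V, W)$;

(b) $(\lambda T)^* = \bar{\lambda} T^*$ for all $\lambda \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$;

(c) $(T^*)^* = T$ for all $T \in \mathcal{L}(V, W)$;

(d) $I^* = I$, where $I$ is the identity operator on $V$;

(e) $(ST)^* = T^* S^*$ for all $T \in \mathcal{L}(V, W)$ and $S \in \mathcal{L}(W, U)$ (here $U$
is an inner product space over $\mathbf{F}$).

Proof

(a) Suppose $S, T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then

\begin{align*}
\langle v, (S + T)^* w \rangle &= \langle (S + T)v, w \rangle \\
&= \langle Sv, w \rangle + \langle Tv, w \rangle \\
&= \langle v, S^* w \rangle + \langle v, T^* w \rangle \\
&= \langle v, S^* w + T^* w \rangle.
\end{align*}

Thus $(S + T)^* w = S^* w + T^* w$, as desired.

(b) Suppose $\lambda \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then

\[ \langle v, (\lambda T)^* w \rangle = \langle \lambda Tv, w \rangle = \lambda \langle Tv, w \rangle = \lambda \langle v, T^* w \rangle = \langle v, \bar{\lambda} T^* w \rangle. \]

Thus $(\lambda T)^* w = \bar{\lambda} T^* w$, as desired.

(c) Suppose $T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then

\[ \langle w, (T^*)^* v \rangle = \langle T^* w, v \rangle = \overline{\langle v, T^* w \rangle} = \overline{\langle Tv, w \rangle} = \langle w, Tv \rangle. \]

Thus $(T^*)^* v = Tv$, as desired.

(d) If $v, u \in V$, then

\[ \langle v, I^* u \rangle = \langle Iv, u \rangle = \langle v, u \rangle. \]

Thus $I^* u = u$, as desired.

(e) Suppose $T \in \mathcal{L}(V, W)$ and $S \in \mathcal{L}(W, U)$. If $v \in V$ and $u \in U$, then

\[ \langle v, (ST)^* u \rangle = \langle STv, u \rangle = \langle Tv, S^* u \rangle = \langle v, T^*(S^* u) \rangle. \]

Thus $(ST)^* u = T^*(S^* u)$, as desired. ■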
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The  next  result  shows  the  relationship  between  the  null  space  and  the  range 
of  a  linear  map  and  its  adjoint.  The  symbol  used  in  the  proof  means  “if 
and  only  if”;  this  symbol  could  also  be  read  to  mean  “is  equivalent  to”. 

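7.7 Null space and range of $T^*$

Suppose $T \in \mathcal{L}(V, W)$. Then

(a) $\operatorname{null} T^* = (\operatorname{range} T)^\perp$;

(b) $\operatorname{range} T^* = (\operatorname{null} T)^\perp$;

(c) $\operatorname{null} T = (\operatorname{range} T^*)^\perp$;

(d) $\operatorname{range} T = (\operatorname{null} T^*)^\perp$.

Proof We begin by proving (a). Let $w \in W$. Then

$$\begin{aligned}
w \in \operatorname{null} T^* &\iff T^*w = 0 \\
&\iff \langle v, T^*w \rangle = 0 \text{ for all } v \in V \\
&\iff \langle Tv, w \rangle = 0 \text{ for all } v \in V \\
&\iff w \in (\operatorname{range} T)^\perp.
\end{aligned}$$

Thus $\operatorname{null} T^* = (\operatorname{range} T)^\perp$, proving (a).

If we take the orthogonal complement of both sides of (a), we get (d), where we have used 6.51. Replacing $T$ with $T^*$ in (a) gives (c), where we have used 7.6(c). Finally, replacing $T$ with $T^*$ in (d) gives (b). ■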

7.8  Definition  conjugate  transpose 

The conjugate transpose of an m-by-n matrix is the n-by-m matrix obtained by interchanging the rows and columns and then taking the complex conjugate of each entry.


7.9  Example 

The conjugate transpose of the matrix

$$\begin{pmatrix} 2 & 3 + 4i & 7 \\ 6 & 5 & 8i \end{pmatrix}$$

is the matrix

$$\begin{pmatrix} 2 & 6 \\ 3 - 4i & 5 \\ 7 & -8i \end{pmatrix}.$$

If $\mathbf{F} = \mathbf{R}$, then the conjugate transpose of a matrix is the same as its transpose, which is the matrix obtained by interchanging the rows and columns.

The adjoint of a linear map does not depend on a choice of basis. This explains why this book emphasizes adjoints of linear maps instead of conjugate transposes of matrices.


The next result shows how to compute the matrix of T* from the matrix of T.

Caution:  Remember  that  the  result 
below  applies  only  when  we  are  dealing 
with  orthonormal  bases.  With  respect  to 
nonorthonormal  bases,  the  matrix  of  T* 
does  not  necessarily  equal  the  conjugate 
transpose  of  the  matrix  of  T. 


7.10  The  matrix  of  T* 

Let $T \in \mathcal{L}(V, W)$. Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $f_1, \dots, f_m$ is an orthonormal basis of $W$. Then

$$\mathcal{M}(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n))$$

is the conjugate transpose of

$$\mathcal{M}(T, (e_1, \dots, e_n), (f_1, \dots, f_m)).$$

Proof In this proof, we will write $\mathcal{M}(T)$ instead of the longer expression $\mathcal{M}(T, (e_1, \dots, e_n), (f_1, \dots, f_m))$; we will also write $\mathcal{M}(T^*)$ instead of $\mathcal{M}(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n))$.

Recall that we obtain the kth column of $\mathcal{M}(T)$ by writing $Te_k$ as a linear combination of the $f_j$'s; the scalars used in this linear combination then become the kth column of $\mathcal{M}(T)$. Because $f_1, \dots, f_m$ is an orthonormal basis of $W$, we know how to write $Te_k$ as a linear combination of the $f_j$'s (see 6.30):

$$Te_k = \langle Te_k, f_1 \rangle f_1 + \cdots + \langle Te_k, f_m \rangle f_m.$$

Thus the entry in row $j$, column $k$, of $\mathcal{M}(T)$ is $\langle Te_k, f_j \rangle$.

Replacing $T$ with $T^*$ and interchanging the roles played by the $e$'s and $f$'s, we see that the entry in row $j$, column $k$, of $\mathcal{M}(T^*)$ is $\langle T^*f_k, e_j \rangle$, which equals $\langle f_k, Te_j \rangle$, which equals $\overline{\langle Te_j, f_k \rangle}$, which equals the complex conjugate of the entry in row $k$, column $j$, of $\mathcal{M}(T)$. In other words, $\mathcal{M}(T^*)$ is the conjugate transpose of $\mathcal{M}(T)$. ■
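As a quick numerical sanity check of 7.10, the sketch below (plain Python, not part of the book's development) treats the matrix from Example 7.9 as a map from C^3 to C^2 with respect to the standard bases, which are orthonormal, and confirms that its conjugate transpose satisfies ⟨Tv, w⟩ = ⟨v, T*w⟩. The helper names and the test vectors v and w are illustrative assumptions.

```python
def conjugate_transpose(matrix):
    """Swap rows and columns, then conjugate each entry."""
    rows, cols = len(matrix), len(matrix[0])
    return [[complex(matrix[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def inner(x, y):
    """Standard inner product, linear in the first slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


# The 2-by-3 matrix from Example 7.9, viewed as a map T from C^3 to C^2.
A = [[2, 3 + 4j, 7],
     [6, 5, 8j]]
A_star = conjugate_transpose(A)

# Arbitrary test vectors (hypothetical values chosen for illustration).
v = [1 + 1j, 2, -1j]   # a vector in C^3
w = [3 - 2j, 1j]       # a vector in C^2

lhs = inner(apply(A, v), w)        # <Tv, w>
rhs = inner(v, apply(A_star, w))   # <v, T*w>
print(A_star)
print(lhs, rhs)
```

The check relies on the standard bases being orthonormal; with respect to a nonorthonormal basis the same computation would generally fail, in line with the caution stated before 7.10.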
Self-Adjoint Operators

Now  we  switch  our  attention  to  operators  on  inner  product  spaces.  Thus 
instead  of  considering  linear  maps  from  V  to  W,  we  will  be  focusing  on  linear 
maps  from  V  to  V;  recall  that  such  linear  maps  are  called  operators. 

7.11  Definition  self-adjoint 

An operator $T \in \mathcal{L}(V)$ is called self-adjoint if $T = T^*$. In other words, $T \in \mathcal{L}(V)$ is self-adjoint if and only if

$$\langle Tv, w \rangle = \langle v, Tw \rangle$$

for all $v, w \in V$.


7.12 Example Suppose $T$ is the operator on $\mathbf{F}^2$ whose matrix (with respect to the standard basis) is

$$\begin{pmatrix} 2 & b \\ 3 & 7 \end{pmatrix}.$$

Find all numbers $b$ such that $T$ is self-adjoint.

Solution The operator $T$ is self-adjoint if and only if $b = 3$ (because $\mathcal{M}(T) = \mathcal{M}(T^*)$ if and only if $b = 3$; recall that $\mathcal{M}(T^*)$ is the conjugate transpose of $\mathcal{M}(T)$; see 7.10).

You  should  verify  that  the  sum  of  two  self-adjoint  operators  is  self-adjoint 
and  that  the  product  of  a  real  scalar  and  a  self-adjoint  operator  is  self-adjoint. 

A good analogy to keep in mind (especially when $\mathbf{F} = \mathbf{C}$) is that the adjoint on $\mathcal{L}(V)$ plays a role similar to complex conjugation on $\mathbf{C}$. A complex number $z$ is real if and only if $z = \overline{z}$; thus a self-adjoint operator ($T = T^*$) is analogous to a real number.

Some mathematicians use the term Hermitian instead of self-adjoint, honoring French mathematician Charles Hermite, who in 1873 published the first proof that e is not a zero of any polynomial with integer coefficients.

We will see that the analogy discussed above is reflected in some important properties of self-adjoint operators, beginning with eigenvalues in the next result.

If $\mathbf{F} = \mathbf{R}$, then by definition every eigenvalue is real, so the next result is interesting only when $\mathbf{F} = \mathbf{C}$.

7.13 Eigenvalues of self-adjoint operators are real

Every eigenvalue of a self-adjoint operator is real.

Proof Suppose $T$ is a self-adjoint operator on $V$. Let $\lambda$ be an eigenvalue of $T$, and let $v$ be a nonzero vector in $V$ such that $Tv = \lambda v$. Then

$$\lambda \|v\|^2 = \langle \lambda v, v \rangle = \langle Tv, v \rangle = \langle v, Tv \rangle = \langle v, \lambda v \rangle = \overline{\lambda} \|v\|^2.$$

Thus $\lambda = \overline{\lambda}$, which means that $\lambda$ is real, as desired. ■
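To see 7.13 in a concrete case, the following sketch computes the eigenvalues of two 2-by-2 matrices. It finds them as roots of the quadratic x^2 - (trace)x + (determinant), which is a computational shortcut assumed here for convenience and not the route this book takes; the matrices H and N and the function name are illustrative choices.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of x^2 - (trace) x + (determinant) for a 2-by-2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


# A matrix equal to its conjugate transpose, so the operator it
# represents on C^2 (standard basis) is self-adjoint.
H = [[2, 1 + 1j],
     [1 - 1j, 3]]
eigs_H = eigenvalues_2x2(H)

# A real matrix that is not symmetric, hence not self-adjoint.
N = [[2, -3],
     [3, 2]]
eigs_N = eigenvalues_2x2(N)

print(eigs_H)   # both values have zero imaginary part
print(eigs_N)   # a pair of nonreal complex conjugates
```

Only the self-adjoint matrix is guaranteed real eigenvalues; the second matrix shows that dropping the hypothesis can produce nonreal ones.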


The next result is false for real inner product spaces. As an example, consider the operator $T \in \mathcal{L}(\mathbf{R}^2)$ that is a counterclockwise rotation of 90° around the origin; thus $T(x, y) = (-y, x)$. Obviously $Tv$ is orthogonal to $v$ for every $v \in \mathbf{R}^2$, even though $T \neq 0$.


7.14 Over $\mathbf{C}$, $Tv$ is orthogonal to $v$ for all $v$ only for the 0 operator

Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$. Suppose

$$\langle Tv, v \rangle = 0$$

for all $v \in V$. Then $T = 0$.

Proof We have

$$\begin{aligned}
\langle Tu, w \rangle = {} & \frac{\langle T(u + w), u + w \rangle - \langle T(u - w), u - w \rangle}{4} \\
& + \frac{\langle T(u + iw), u + iw \rangle - \langle T(u - iw), u - iw \rangle}{4}\, i
\end{aligned}$$

for all $u, w \in V$, as can be verified by computing the right side. Note that each term on the right side is of the form $\langle Tv, v \rangle$ for appropriate $v \in V$. Thus our hypothesis implies that $\langle Tu, w \rangle = 0$ for all $u, w \in V$. This implies that $T = 0$ (take $w = Tu$). ■
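The identity used in the proof of 7.14 can be spot-checked numerically. The sketch below evaluates both sides for one operator on C^2 and one pair of vectors, all chosen arbitrarily for illustration, and also evaluates ⟨Rv, v⟩ for the 90° rotation R of R^2 mentioned before 7.14. The helper names are assumptions, and a single numerical instance is of course not a proof.

```python
def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def inner(x, y):
    """Standard inner product, linear in the first slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def add(x, y, scale=1):
    """Return x + scale * y."""
    return [a + scale * b for a, b in zip(x, y)]


def q(matrix, v):
    """The quantity <Tv, v>."""
    return inner(apply(matrix, v), v)


# Hypothetical operator on C^2 and test vectors.
T = [[1 + 2j, 3], [-1j, 4 - 1j]]
u = [2, 1 - 1j]
w = [1j, 3 + 2j]

lhs = inner(apply(T, u), w)
rhs = ((q(T, add(u, w)) - q(T, add(u, w, -1))) / 4
       + (q(T, add(u, w, 1j)) - q(T, add(u, w, -1j))) / 4 * 1j)
print(lhs, rhs)

# Over R the conclusion of 7.14 fails: rotation by 90 degrees is
# nonzero, yet <Rv, v> = 0 for every v.
R = [[0, -1], [1, 0]]
rotation_values = [q(R, v) for v in ([1, 0], [0, 1], [2, -5])]
print(rotation_values)
```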


The next result provides another example of how self-adjoint operators behave like real numbers.

The next result is false for real inner product spaces, as shown by considering any operator on a real inner product space that is not self-adjoint.

7.15 Over $\mathbf{C}$, $\langle Tv, v \rangle$ is real for all $v$ only for self-adjoint operators

Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$. Then $T$ is self-adjoint if and only if

$$\langle Tv, v \rangle \in \mathbf{R}$$

for every $v \in V$.

Proof Let $v \in V$. Then

$$\langle Tv, v \rangle - \overline{\langle Tv, v \rangle} = \langle Tv, v \rangle - \langle v, Tv \rangle = \langle Tv, v \rangle - \langle T^*v, v \rangle = \langle (T - T^*)v, v \rangle.$$

If $\langle Tv, v \rangle \in \mathbf{R}$ for every $v \in V$, then the left side of the equation above equals 0, so $\langle (T - T^*)v, v \rangle = 0$ for every $v \in V$. This implies that $T - T^* = 0$ (by 7.14). Hence $T$ is self-adjoint.

Conversely, if $T$ is self-adjoint, then the right side of the equation above equals 0, so $\langle Tv, v \rangle = \overline{\langle Tv, v \rangle}$ for every $v \in V$. This implies that $\langle Tv, v \rangle \in \mathbf{R}$ for every $v \in V$, as desired. ■

On a real inner product space $V$, a nonzero operator $T$ might satisfy $\langle Tv, v \rangle = 0$ for all $v \in V$. However, the next result shows that this cannot happen for a self-adjoint operator.

7.16 If $T = T^*$ and $\langle Tv, v \rangle = 0$ for all $v$, then $T = 0$

Suppose $T$ is a self-adjoint operator on $V$ such that

$$\langle Tv, v \rangle = 0$$

for all $v \in V$. Then $T = 0$.

Proof We have already proved this (without the hypothesis that $T$ is self-adjoint) when $V$ is a complex inner product space (see 7.14). Thus we can assume that $V$ is a real inner product space. If $u, w \in V$, then

$$\langle Tu, w \rangle = \frac{\langle T(u + w), u + w \rangle - \langle T(u - w), u - w \rangle}{4}; \tag{7.17}$$

this is proved by computing the right side using the equation

$$\langle Tw, u \rangle = \langle w, Tu \rangle = \langle Tu, w \rangle,$$

where the first equality holds because $T$ is self-adjoint and the second equality holds because we are working in a real inner product space.

Each term on the right side of 7.17 is of the form $\langle Tv, v \rangle$ for appropriate $v$. Thus $\langle Tu, w \rangle = 0$ for all $u, w \in V$. This implies that $T = 0$ (take $w = Tu$). ■
Normal Operators

7.18 Definition normal

An operator on an inner product space is called normal if it commutes with its adjoint.

In other words, $T \in \mathcal{L}(V)$ is normal if

$$TT^* = T^*T.$$

Obviously every self-adjoint operator is normal, because if $T$ is self-adjoint then $T^* = T$.


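7.19 Example Let $T$ be the operator on $\mathbf{F}^2$ whose matrix (with respect to the standard basis) is

$$\begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}.$$

Show that $T$ is not self-adjoint and that $T$ is normal.

Solution This operator is not self-adjoint because the entry in row 2, column 1 (which equals 3) does not equal the complex conjugate of the entry in row 1, column 2 (which equals $-3$).

The matrix of $TT^*$ equals

$$\begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix}, \quad \text{which equals} \quad \begin{pmatrix} 13 & 0 \\ 0 & 13 \end{pmatrix}.$$

Similarly, the matrix of $T^*T$ equals

$$\begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix} \begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}, \quad \text{which equals} \quad \begin{pmatrix} 13 & 0 \\ 0 & 13 \end{pmatrix}.$$

Because $TT^*$ and $T^*T$ have the same matrix, we see that $TT^* = T^*T$. Thus $T$ is normal.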
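The two matrix products in Example 7.19 are easy to reproduce mechanically. The sketch below does so in plain Python; the helper functions are illustrative assumptions, and the matrix is the one from the example, taken with respect to the standard (orthonormal) basis so that the adjoint corresponds to the conjugate transpose.

```python
def conjugate_transpose(matrix):
    """Swap rows and columns, then conjugate each entry."""
    rows, cols = len(matrix), len(matrix[0])
    return [[complex(matrix[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


T = [[2, -3],
     [3, 2]]
T_star = conjugate_transpose(T)

TT_star = matmul(T, T_star)
T_starT = matmul(T_star, T)

is_self_adjoint = (T == T_star)   # compares the matrices entry by entry
is_normal = (TT_star == T_starT)

print(TT_star, T_starT)
print(is_self_adjoint, is_normal)
```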


The next result implies that $\operatorname{null} T = \operatorname{null} T^*$ for every normal operator $T$.

In the next section we will see why normal operators are worthy of special attention.

The next result provides a simple characterization of normal operators.

7.20 $T$ is normal if and only if $\|Tv\| = \|T^*v\|$ for all $v$

An operator $T \in \mathcal{L}(V)$ is normal if and only if

$$\|Tv\| = \|T^*v\|$$

for all $v \in V$.

Proof Let $T \in \mathcal{L}(V)$. We will prove both directions of this result at the same time. Note that

$$\begin{aligned}
T \text{ is normal} &\iff T^*T - TT^* = 0 \\
&\iff \langle (T^*T - TT^*)v, v \rangle = 0 \text{ for all } v \in V \\
&\iff \langle T^*Tv, v \rangle = \langle TT^*v, v \rangle \text{ for all } v \in V \\
&\iff \|Tv\|^2 = \|T^*v\|^2 \text{ for all } v \in V,
\end{aligned}$$

where we used 7.16 to establish the second equivalence (note that the operator $T^*T - TT^*$ is self-adjoint). The equivalence of the first and last conditions above gives the desired result. ■
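A small numerical illustration of 7.20: for the normal operator of Example 7.19 the norms of Tv and T*v agree, while for a non-normal operator they can differ. The second matrix S and the test vector v are hypothetical choices made for this sketch, and the helper names are assumptions.

```python
import math


def conjugate_transpose(matrix):
    """Swap rows and columns, then conjugate each entry."""
    rows, cols = len(matrix), len(matrix[0])
    return [[complex(matrix[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def norm(x):
    """Euclidean norm of a vector with real or complex entries."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))


T = [[2, -3], [3, 2]]   # normal (Example 7.19)
S = [[1, 1], [0, 1]]    # not normal: S S* differs from S* S
v = [1, 2]

norms_T = (norm(apply(T, v)), norm(apply(conjugate_transpose(T), v)))
norms_S = (norm(apply(S, v)), norm(apply(conjugate_transpose(S), v)))

print(norms_T)   # the two entries agree
print(norms_S)   # the two entries differ
```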

Compare  the  next  corollary  to  Exercise  2.  That  exercise  states  that  the 
eigenvalues  of  the  adjoint  of  each  operator  are  equal  (as  a  set)  to  the  complex 
conjugates  of  the  eigenvalues  of  the  operator.  The  exercise  says  nothing 
about  eigenvectors,  because  an  operator  and  its  adjoint  may  have  different 
eigenvectors.  However,  the  next  corollary  implies  that  a  normal  operator  and 
its  adjoint  have  the  same  eigenvectors. 

7.21 For $T$ normal, $T$ and $T^*$ have the same eigenvectors

Suppose $T \in \mathcal{L}(V)$ is normal and $v \in V$ is an eigenvector of $T$ with eigenvalue $\lambda$. Then $v$ is also an eigenvector of $T^*$ with eigenvalue $\overline{\lambda}$.

Proof Because $T$ is normal, so is $T - \lambda I$, as you should verify. Using 7.20, we have

$$0 = \|(T - \lambda I)v\| = \|(T - \lambda I)^*v\| = \|(T^* - \overline{\lambda} I)v\|.$$

Hence $v$ is an eigenvector of $T^*$ with eigenvalue $\overline{\lambda}$, as desired. ■


Because every self-adjoint operator is normal, the next result applies in particular to self-adjoint operators.

7.22 Orthogonal eigenvectors for normal operators

Suppose $T \in \mathcal{L}(V)$ is normal. Then eigenvectors of $T$ corresponding to distinct eigenvalues are orthogonal.

Proof Suppose $\alpha, \beta$ are distinct eigenvalues of $T$ with corresponding eigenvectors $u, v$. Thus $Tu = \alpha u$ and $Tv = \beta v$. From 7.21 we have $T^*v = \overline{\beta} v$. Thus

$$\begin{aligned}
(\alpha - \beta) \langle u, v \rangle &= \langle \alpha u, v \rangle - \langle u, \overline{\beta} v \rangle \\
&= \langle Tu, v \rangle - \langle u, T^*v \rangle \\
&= 0.
\end{aligned}$$

Because $\alpha \neq \beta$, the equation above implies that $\langle u, v \rangle = 0$. Thus $u$ and $v$ are orthogonal, as desired. ■


EXERCISES  7. A 

1 Suppose $n$ is a positive integer. Define $T \in \mathcal{L}(\mathbf{F}^n)$ by

$$T(z_1, \dots, z_n) = (0, z_1, \dots, z_{n-1}).$$

Find a formula for $T^*(z_1, \dots, z_n)$.

2 Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Prove that $\lambda$ is an eigenvalue of $T$ if and only if $\overline{\lambda}$ is an eigenvalue of $T^*$.

3 Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$. Prove that $U$ is invariant under $T$ if and only if $U^\perp$ is invariant under $T^*$.

4  Suppose  T  e  C(V,W).  Prove  that 

(a)  T  is  injective  if  and  only  if  T*  is  surjective; 

(b)  T  is  surjective  if  and  only  if  T*  is  injective. 

5 Prove that

$$\dim \operatorname{null} T^* = \dim \operatorname{null} T + \dim W - \dim V$$

and

$$\dim \operatorname{range} T^* = \dim \operatorname{range} T$$

for every $T \in \mathcal{L}(V, W)$.

6 Make $\mathcal{P}_2(\mathbf{R})$ into an inner product space by defining

$$\langle p, q \rangle = \int_0^1 p(x) q(x)\, dx.$$

Define $T \in \mathcal{L}(\mathcal{P}_2(\mathbf{R}))$ by $T(a_0 + a_1 x + a_2 x^2) = a_1 x$.

(a)  Show  that  T  is  not  self-adjoint. 

(b) The matrix of $T$ with respect to the basis $(1, x, x^2)$ is

$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

This  matrix  equals  its  conjugate  transpose,  even  though  T  is  not 
self-adjoint.  Explain  why  this  is  not  a  contradiction. 

7 Suppose $S, T \in \mathcal{L}(V)$ are self-adjoint. Prove that $ST$ is self-adjoint if and only if $ST = TS$.

8 Suppose $V$ is a real inner product space. Show that the set of self-adjoint operators on $V$ is a subspace of $\mathcal{L}(V)$.

9 Suppose $V$ is a complex inner product space with $V \neq \{0\}$. Show that the set of self-adjoint operators on $V$ is not a subspace of $\mathcal{L}(V)$.

10 Suppose $\dim V \geq 2$. Show that the set of normal operators on $V$ is not a subspace of $\mathcal{L}(V)$.

11 Suppose $P \in \mathcal{L}(V)$ is such that $P^2 = P$. Prove that there is a subspace $U$ of $V$ such that $P = P_U$ if and only if $P$ is self-adjoint.

12 Suppose that $T$ is a normal operator on $V$ and that 3 and 4 are eigenvalues of $T$. Prove that there exists a vector $v \in V$ such that $\|v\| = \sqrt{2}$ and $\|Tv\| = 5$.

13 Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^4)$ such that $T$ is normal but not self-adjoint.

14 Suppose $T$ is a normal operator on $V$. Suppose also that $v, w \in V$ satisfy the equations

$$\|v\| = \|w\| = 2, \quad Tv = 3v, \quad Tw = 4w.$$

Show that $\|T(v + w)\| = 10$.
15 Fix $u, x \in V$. Define $T \in \mathcal{L}(V)$ by

$$Tv = \langle v, u \rangle x$$

for every $v \in V$.

(a) Suppose $\mathbf{F} = \mathbf{R}$. Prove that $T$ is self-adjoint if and only if $u, x$ is linearly dependent.

(b) Prove that $T$ is normal if and only if $u, x$ is linearly dependent.

16 Suppose $T \in \mathcal{L}(V)$ is normal. Prove that

$$\operatorname{range} T = \operatorname{range} T^*.$$

17 Suppose $T \in \mathcal{L}(V)$ is normal. Prove that

$$\operatorname{null} T^k = \operatorname{null} T \quad \text{and} \quad \operatorname{range} T^k = \operatorname{range} T$$

for every positive integer $k$.

18 Prove or give a counterexample: If $T \in \mathcal{L}(V)$ and there exists an orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\|Te_j\| = \|T^*e_j\|$ for each $j$, then $T$ is normal.

19 Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is normal and $T(1, 1, 1) = (2, 2, 2)$. Suppose $(z_1, z_2, z_3) \in \operatorname{null} T$. Prove that $z_1 + z_2 + z_3 = 0$.

20 Suppose $T \in \mathcal{L}(V, W)$ and $\mathbf{F} = \mathbf{R}$. Let $\Phi_V$ be the isomorphism from $V$ onto the dual space $V'$ given by Exercise 17 in Section 6.B, and let $\Phi_W$ be the corresponding isomorphism from $W$ onto $W'$. Show that if $\Phi_V$ and $\Phi_W$ are used to identify $V$ and $W$ with $V'$ and $W'$, then $T^*$ is identified with the dual map $T'$. More precisely, show that $\Phi_V \circ T^* = T' \circ \Phi_W$.

21 Fix a positive integer $n$. In the inner product space of continuous real-valued functions on $[-\pi, \pi]$ with inner product

$$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x)\, dx,$$

let

$$V = \operatorname{span}(1, \cos x, \cos 2x, \dots, \cos nx, \sin x, \sin 2x, \dots, \sin nx).$$

(a) Define $D \in \mathcal{L}(V)$ by $Df = f'$. Show that $D^* = -D$. Conclude that $D$ is normal but not self-adjoint.

(b) Define $T \in \mathcal{L}(V)$ by $Tf = f''$. Show that $T$ is self-adjoint.
7.B The Spectral Theorem

Recall that a diagonal matrix is a square matrix that is 0 everywhere except possibly along the diagonal. Recall also that an operator on $V$ has a diagonal matrix with respect to a basis if and only if the basis consists of eigenvectors of the operator (see 5.41).

The nicest operators on $V$ are those for which there is an orthonormal basis of $V$ with respect to which the operator has a diagonal matrix. These are precisely the operators $T \in \mathcal{L}(V)$ such that there is an orthonormal basis of $V$ consisting of eigenvectors of $T$. Our goal in this section is to prove the Spectral Theorem, which characterizes these operators as the normal operators when $\mathbf{F} = \mathbf{C}$ and as the self-adjoint operators when $\mathbf{F} = \mathbf{R}$. The Spectral Theorem is probably the most useful tool in the study of operators on inner product spaces.

Because the conclusion of the Spectral Theorem depends on $\mathbf{F}$, we will break the Spectral Theorem into two pieces, called the Complex Spectral Theorem and the Real Spectral Theorem. As is often the case in linear algebra, complex vector spaces are easier to deal with than real vector spaces. Thus we present the Complex Spectral Theorem first.

The Complex Spectral Theorem

The  key  part  of  the  Complex  Spectral  Theorem  (7.24)  states  that  if  F  =  C 
and  T  e  C(V)  is  normal,  then  T  has  a  diagonal  matrix  with  respect  to  some 
orthonormal  basis  of  V.  The  next  example  illustrates  this  conclusion. 


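7.23 Example Consider the normal operator $T \in \mathcal{L}(\mathbf{C}^2)$ from Example 7.19, whose matrix (with respect to the standard basis) is

$$\begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}.$$

As you can verify, $\frac{(i, 1)}{\sqrt{2}}, \frac{(-i, 1)}{\sqrt{2}}$ is an orthonormal basis of $\mathbf{C}^2$ consisting of eigenvectors of $T$, and with respect to this basis the matrix of $T$ is the diagonal matrix

$$\begin{pmatrix} 2 + 3i & 0 \\ 0 & 2 - 3i \end{pmatrix}.$$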
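The verification invited in Example 7.23 can be sketched numerically as follows. The code checks that the vectors (i, 1)/√2 and (-i, 1)/√2 are eigenvectors of the matrix from Example 7.19 with eigenvalues 2 + 3i and 2 - 3i, and that they form an orthonormal list. The helper names are assumptions, and floating-point arithmetic means the checks hold only up to rounding.

```python
import math


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def inner(x, y):
    """Standard inner product on C^n, linear in the first slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def residual(matrix, vec, lam):
    """Largest entry of (matrix @ vec - lam * vec) in absolute value."""
    return max(abs(a - lam * x) for a, x in zip(apply(matrix, vec), vec))


T = [[2, -3],
     [3, 2]]
s = 1 / math.sqrt(2)
e1 = [1j * s, s]     # (i, 1) / sqrt(2)
e2 = [-1j * s, s]    # (-i, 1) / sqrt(2)

r1 = residual(T, e1, 2 + 3j)
r2 = residual(T, e2, 2 - 3j)

# Gram matrix of the list e1, e2; orthonormality means it is the identity.
gram = [[inner(x, y) for y in (e1, e2)] for x in (e1, e2)]

print(r1, r2)
print(gram)
```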


In the next result, the equivalence of (b) and (c) is easy (see 5.41). Thus we prove only that (c) implies (a) and that (a) implies (c).

7.24 Complex Spectral Theorem

Suppose  F  =  C  and  T  e  C{V).  Then  the  following  are  equivalent: 

(a)  T  is  normal. 

(b)  V  has  an  orthonormal  basis  consisting  of  eigenvectors  of  T. 

(c)  T  has  a  diagonal  matrix  with  respect  to  some  orthonormal  basis 
of  V. 


Proof  First  suppose  (c)  holds,  so  T  has  a  diagonal  matrix  with  respect  to 
some  orthonormal  basis  of  V.  The  matrix  of  T*  (with  respect  to  the  same 
basis)  is  obtained  by  taking  the  conjugate  transpose  of  the  matrix  of  T ;  hence 
T*  also  has  a  diagonal  matrix.  Any  two  diagonal  matrices  commute;  thus  T 
commutes  with  T*  which  means  that  T  is  normal.  In  other  words,  (a)  holds. 

Now suppose (a) holds, so $T$ is normal. By Schur's Theorem (6.38), there is an orthonormal basis $e_1, \dots, e_n$ of $V$ with respect to which $T$ has an upper-triangular matrix. Thus we can write

$$\mathcal{M}(T, (e_1, \dots, e_n)) = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ & \ddots & \vdots \\ 0 & & a_{n,n} \end{pmatrix}. \tag{7.25}$$

We will show that this matrix is actually a diagonal matrix.

We see from the matrix above that

$$\|Te_1\|^2 = |a_{1,1}|^2$$

and

$$\|T^*e_1\|^2 = |a_{1,1}|^2 + |a_{1,2}|^2 + \cdots + |a_{1,n}|^2.$$

Because $T$ is normal, $\|Te_1\| = \|T^*e_1\|$ (see 7.20). Thus the two equations above imply that all entries in the first row of the matrix in 7.25, except possibly the first entry $a_{1,1}$, equal 0.

Now from 7.25 we see that

$$\|Te_2\|^2 = |a_{2,2}|^2$$

(because $a_{1,2} = 0$, as we showed in the paragraph above) and

$$\|T^*e_2\|^2 = |a_{2,2}|^2 + |a_{2,3}|^2 + \cdots + |a_{2,n}|^2.$$

Because $T$ is normal, $\|Te_2\| = \|T^*e_2\|$. Thus the two equations above imply that all entries in the second row of the matrix in 7.25, except possibly the diagonal entry $a_{2,2}$, equal 0.

Continuing in this fashion, we see that all the nondiagonal entries in the matrix 7.25 equal 0. Thus (c) holds. ■
The Real Spectral Theorem

We will need a few preliminary results, which apply to both real and complex inner product spaces, for our proof of the Real Spectral Theorem.

You could guess that the next result is true and even discover its proof by thinking about quadratic polynomials with real coefficients. Specifically, suppose $b, c \in \mathbf{R}$ and $b^2 < 4c$. Let $x$ be a real number. Then

$$x^2 + bx + c = \left(x + \frac{b}{2}\right)^2 + \left(c - \frac{b^2}{4}\right) > 0.$$

In particular, $x^2 + bx + c$ is an invertible real number (a convoluted way of saying that it is not 0). Replacing the real number $x$ with a self-adjoint operator (recall the analogy between real numbers and self-adjoint operators), we are led to the result below.

This technique of completing the square can be used to derive the quadratic formula.


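7.26 Invertible quadratic expressions

Suppose $T \in \mathcal{L}(V)$ is self-adjoint and $b, c \in \mathbf{R}$ are such that $b^2 < 4c$. Then

$$T^2 + bT + cI$$

is invertible.

Proof Let $v$ be a nonzero vector in $V$. Then

$$\begin{aligned}
\langle (T^2 + bT + cI)v, v \rangle &= \langle T^2 v, v \rangle + b \langle Tv, v \rangle + c \langle v, v \rangle \\
&= \langle Tv, Tv \rangle + b \langle Tv, v \rangle + c \|v\|^2 \\
&\geq \|Tv\|^2 - |b| \|Tv\| \|v\| + c \|v\|^2 \\
&= \left( \|Tv\| - \frac{|b| \|v\|}{2} \right)^2 + \left( c - \frac{b^2}{4} \right) \|v\|^2 \\
&> 0,
\end{aligned}$$

where the third line above holds by the Cauchy-Schwarz Inequality (6.15). The last inequality implies that $(T^2 + bT + cI)v \neq 0$. Thus $T^2 + bT + cI$ is injective, which implies that it is invertible (see 3.69). ■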
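The chain of inequalities in the proof of 7.26 yields the lower bound ⟨(T² + bT + cI)v, v⟩ ≥ (c - b²/4)‖v‖² > 0 for nonzero v. The sketch below checks this bound for one real symmetric matrix (which represents a self-adjoint operator on R² with respect to the standard basis) and a handful of vectors. The matrix, the numbers b and c, the test vectors, and the helper names are all hypothetical choices for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


T = [[1, 2],
     [2, -1]]      # symmetric, hence self-adjoint on R^2
b, c = 1, 1        # satisfies b^2 < 4c

T2 = matmul(T, T)
# M represents T^2 + bT + cI.
M = [[T2[i][j] + b * T[i][j] + (c if i == j else 0) for j in range(2)]
     for i in range(2)]

margin = c - b * b / 4          # the positive constant c - b^2/4
checks = []
for v in ([1, 0], [0, 1], [1, -1], [3, 2]):
    value = dot(apply(M, v), v)          # <(T^2 + bT + cI)v, v>
    bound = margin * dot(v, v)           # (c - b^2/4) ||v||^2
    checks.append(value >= bound > 0)

print(M)
print(checks)
```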


We know that every operator, self-adjoint or not, on a finite-dimensional nonzero complex vector space has an eigenvalue (see 5.21). Thus the next result tells us something new only for real inner product spaces.

7.27 Self-adjoint operators have eigenvalues

Suppose $V \neq \{0\}$ and $T \in \mathcal{L}(V)$ is a self-adjoint operator. Then $T$ has an eigenvalue.

Proof We can assume that $V$ is a real inner product space, as we have already noted. Let $n = \dim V$ and choose $v \in V$ with $v \neq 0$. Then

$$v, Tv, T^2 v, \dots, T^n v$$

cannot be linearly independent, because $V$ has dimension $n$ and we have $n + 1$ vectors. Thus there exist real numbers $a_0, \dots, a_n$, not all 0, such that

$$0 = a_0 v + a_1 Tv + \cdots + a_n T^n v.$$

Make the $a$'s the coefficients of a polynomial, which can be written in factored form (see 4.17) as

$$\begin{aligned}
a_0 + a_1 x + \cdots + a_n x^n = {} & c(x^2 + b_1 x + c_1) \cdots (x^2 + b_M x + c_M) \\
& \cdot (x - \lambda_1) \cdots (x - \lambda_m),
\end{aligned}$$

where $c$ is a nonzero real number, each $b_j$, $c_j$, and $\lambda_j$ is real, each $b_j^2$ is less than $4c_j$, $m + M \geq 1$, and the equation holds for all real $x$. We then have

$$\begin{aligned}
0 &= a_0 v + a_1 Tv + \cdots + a_n T^n v \\
&= (a_0 I + a_1 T + \cdots + a_n T^n)v \\
&= c(T^2 + b_1 T + c_1 I) \cdots (T^2 + b_M T + c_M I)(T - \lambda_1 I) \cdots (T - \lambda_m I)v.
\end{aligned}$$

By 7.26, each $T^2 + b_j T + c_j I$ is invertible. Recall also that $c \neq 0$. Thus the equation above implies that $m > 0$ and

$$0 = (T - \lambda_1 I) \cdots (T - \lambda_m I)v.$$

Hence $T - \lambda_j I$ is not injective for at least one $j$. In other words, $T$ has an eigenvalue. ■

The next result shows that if $U$ is a subspace of $V$ that is invariant under a self-adjoint operator $T$, then $U^\perp$ is also invariant under $T$. Later we will show that the hypothesis that $T$ is self-adjoint can be replaced with the weaker hypothesis that $T$ is normal (see 9.30).

7.28 Self-adjoint operators and invariant subspaces

Suppose $T \in \mathcal{L}(V)$ is self-adjoint and $U$ is a subspace of $V$ that is invariant under $T$. Then

(a) $U^\perp$ is invariant under $T$;

(b) $T|_U \in \mathcal{L}(U)$ is self-adjoint;

(c) $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ is self-adjoint.

Proof To prove (a), suppose $v \in U^\perp$. Let $u \in U$. Then

$$\langle Tv, u \rangle = \langle v, Tu \rangle = 0,$$

where the first equality above holds because $T$ is self-adjoint and the second equality above holds because $U$ is invariant under $T$ (and hence $Tu \in U$) and because $v \in U^\perp$. Because the equation above holds for each $u \in U$, we conclude that $Tv \in U^\perp$. Thus $U^\perp$ is invariant under $T$, completing the proof of (a).

To prove (b), note that if $u, v \in U$, then

$$\langle (T|_U)u, v \rangle = \langle Tu, v \rangle = \langle u, Tv \rangle = \langle u, (T|_U)v \rangle.$$

Thus $T|_U$ is self-adjoint.

Now (c) follows from replacing $U$ with $U^\perp$ in (b), which makes sense by (a). ■

We  can  now  prove  the  next  result,  which  is  one  of  the  major  theorems  in 
linear  algebra. 


7.29  Real  Spectral  Theorem 

Suppose  F  =  R  and  T  e  C{V).  Then  the  following  are  equivalent: 

(a)  T  is  self-adjoint. 

(b)  V  has  an  orthonormal  basis  consisting  of  eigenvectors  of  T. 


(c) $T$ has a diagonal matrix with respect to some orthonormal basis of $V$.

Proof First suppose (c) holds, so $T$ has a diagonal matrix with respect to some orthonormal basis of $V$. A diagonal matrix equals its transpose. Hence $T = T^*$ and thus $T$ is self-adjoint. In other words, (a) holds.

We will prove that (a) implies (b) by induction on $\dim V$. To get started, note that if $\dim V = 1$, then (a) implies (b). Now assume that $\dim V > 1$ and that (a) implies (b) for all real inner product spaces of smaller dimension.

Suppose (a) holds, so $T \in \mathcal{L}(V)$ is self-adjoint. Let $u$ be an eigenvector of $T$ with $\|u\| = 1$ (7.27 guarantees that $T$ has an eigenvector, which can then be divided by its norm to produce an eigenvector with norm 1). Let $U = \operatorname{span}(u)$. Then $U$ is a 1-dimensional subspace of $V$ that is invariant under $T$. By 7.28(c), the operator $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ is self-adjoint.

By our induction hypothesis, there is an orthonormal basis of $U^\perp$ consisting of eigenvectors of $T|_{U^\perp}$. Adjoining $u$ to this orthonormal basis of $U^\perp$ gives an orthonormal basis of $V$ consisting of eigenvectors of $T$, completing the proof that (a) implies (b).

We have proved that (c) implies (a) and that (a) implies (b). Clearly (b) implies (c), completing the proof. ■


7.30 Example Consider the self-adjoint operator $T$ on $\mathbf{R}^3$ whose matrix (with respect to the standard basis) is

$$\begin{pmatrix} 14 & -13 & 8 \\ -13 & 14 & 8 \\ 8 & 8 & -7 \end{pmatrix}.$$

As you can verify,

$$\frac{(1, -1, 0)}{\sqrt{2}}, \quad \frac{(1, 1, 1)}{\sqrt{3}}, \quad \frac{(1, 1, -2)}{\sqrt{6}}$$

is an orthonormal basis of $\mathbf{R}^3$ consisting of eigenvectors of $T$, and with respect to this basis, the matrix of $T$ is the diagonal matrix

$$\begin{pmatrix} 27 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & -15 \end{pmatrix}.$$
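The verification suggested in Example 7.30 can be carried out with exact integer arithmetic by working with the unnormalized vectors (1, -1, 0), (1, 1, 1), (1, 1, -2); dividing each by its norm (√2, √3, √6) does not change eigenvalues or orthogonality. The helper names below are assumptions made for this sketch.

```python
def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


T = [[14, -13, 8],
     [-13, 14, 8],
     [8, 8, -7]]

# Unnormalized eigenvectors paired with the claimed eigenvalues.
pairs = [([1, -1, 0], 27),
         ([1, 1, 1], 9),
         ([1, 1, -2], -15)]

# T v should equal lambda * v for each pair.
eigen_ok = [apply(T, v) == [lam * x for x in v] for v, lam in pairs]

# Distinct vectors in the list should be pairwise orthogonal.
vectors = [v for v, _ in pairs]
orthogonal_ok = [dot(vectors[i], vectors[j]) == 0
                 for i in range(3) for j in range(i + 1, 3)]

# Squared norms, whose square roots are the normalizing constants.
squared_norms = [dot(v, v) for v in vectors]

print(eigen_ok, orthogonal_ok, squared_norms)
```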


If $\mathbf{F} = \mathbf{C}$, then the Complex Spectral Theorem gives a complete description of the normal operators on $V$. A complete description of the self-adjoint operators on $V$ then easily follows (they are the normal operators on $V$ whose eigenvalues all are real; see Exercise 6).

If $\mathbf{F} = \mathbf{R}$, then the Real Spectral Theorem gives a complete description of the self-adjoint operators on $V$. In Chapter 9, we will give a complete description of the normal operators on $V$ (see 9.34).

EXERCISES 7.B


1 True or false (and give a proof of your answer): There exists $T \in \mathcal{L}(\mathbf{R}^3)$ such that $T$ is not self-adjoint (with respect to the usual inner product) and such that there is a basis of $\mathbf{R}^3$ consisting of eigenvectors of $T$.

2 Suppose that $T$ is a self-adjoint operator on a finite-dimensional inner product space and that 2 and 3 are the only eigenvalues of $T$. Prove that $T^2 - 5T + 6I = 0$.

3 Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^3)$ such that 2 and 3 are the only eigenvalues of $T$ and $T^2 - 5T + 6I \neq 0$.

4 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Prove that $T$ is normal if and only if all pairs of eigenvectors corresponding to distinct eigenvalues of $T$ are orthogonal and

$$V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T),$$

where $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$.

5 Suppose $\mathbf{F} = \mathbf{R}$ and $T \in \mathcal{L}(V)$. Prove that $T$ is self-adjoint if and only if all pairs of eigenvectors corresponding to distinct eigenvalues of $T$ are orthogonal and

$$V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T),$$

where $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$.

6 Prove that a normal operator on a complex inner product space is self-adjoint if and only if all its eigenvalues are real.

[The exercise above strengthens the analogy (for normal operators) between self-adjoint operators and real numbers.]

7  Suppose  V  is  a  complex  inner  product  space  and  T  e  C(V)  is  a  normal 
operator  such  that  T9  =  Ts.  Prove  that  T  is  self-adjoint  and  T 2  =  T. 

8  Give  an  example  of  an  operator  T  on  a  complex  vector  space  such  that 
T9  =  T 8  but  T2  ^  T. 

9  Suppose  V  is  a  complex  inner  product  space.  Prove  that  every  normal 
operator  on  V  has  a  square  root.  (An  operator  S  e  E{V)  is  called  a 
square  root  of  T  e  C(V)  if  S 2  =  T.) 
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10  Give  an  example  of  a  real  inner  product  space  V  and  T  e  CfV)  and  real 
numbers  b ,  c  with  b 2  <  4 c  such  that  T 2  +  bT  +  cl  is  not  invertible. 
[The  exercise  above  shows  that  the  hypothesis  that  T  is  self-adjoint  is 
needed  in  7.26 ,  even  for  real  vector  spaces .] 

11  Prove  or  give  a  counterexample:  every  self-adjoint  operator  on  V  has  a 
cube  root.  (An  operator  S  e  C(V)  is  called  a  cube  root  ofT  e  C(V)  if 
S3  =  T.) 

12  Suppose $T \in \mathcal{L}(V)$ is self-adjoint, $\lambda \in \mathbf{F}$, and $\epsilon > 0$. Suppose there
exists $v \in V$ such that $\|v\| = 1$ and

$$\|Tv - \lambda v\| < \epsilon.$$

Prove that $T$ has an eigenvalue $\lambda'$ such that $|\lambda - \lambda'| < \epsilon$.

13  Give  an  alternative  proof  of  the  Complex  Spectral  Theorem  that  avoids 
Schur’s  Theorem  and  instead  follows  the  pattern  of  the  proof  of  the  Real 
Spectral  Theorem. 

14  Suppose $U$ is a finite-dimensional real vector space and $T \in \mathcal{L}(U)$.
Prove that $U$ has a basis consisting of eigenvectors of $T$ if and only if
there is an inner product on $U$ that makes $T$ into a self-adjoint operator.

15  Find  the  matrix  entry  below  that  is  covered  up. 


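[Figure: cartoon captioned "Keep away from him... he's not normal!" showing a
matrix with one entry covered up; the matrix entries are not legible in this copy.]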
7.C  Positive Operators and Isometries

Positive Operators

7.31  Definition  positive operator

An operator $T \in \mathcal{L}(V)$ is called positive if $T$ is self-adjoint and

$$\langle Tv, v\rangle \ge 0$$

for all $v \in V$.

If $V$ is a complex vector space, then the requirement that $T$ is self-adjoint
can be dropped from the definition above (by 7.15).

7.32  Example  positive operators

(a)  If $U$ is a subspace of $V$, then the orthogonal projection $P_U$ is a positive
operator, as you should verify.

(b)  If $T \in \mathcal{L}(V)$ is self-adjoint and $b, c \in \mathbf{R}$ are such that $b^2 < 4c$, then
$T^2 + bT + cI$ is a positive operator, as shown by the proof of 7.26.


7.33  Definition  square root

An operator $R$ is called a square root of an operator $T$ if $R^2 = T$.


7.34  Example  If $T \in \mathcal{L}(\mathbf{F}^3)$ is defined by $T(z_1, z_2, z_3) = (z_3, 0, 0)$,
then the operator $R \in \mathcal{L}(\mathbf{F}^3)$ defined by $R(z_1, z_2, z_3) = (z_2, z_3, 0)$ is a
square root of $T$.
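
The claim in Example 7.34 can be checked directly by applying R twice. The sketch below is only an illustration: the function names T, R, and the sample vectors are choices made here, and vectors in F^3 are modeled as Python tuples.

```python
# Operators from Example 7.34, acting on triples (z1, z2, z3).
def T(z):
    z1, z2, z3 = z
    return (z3, 0, 0)

def R(z):
    z1, z2, z3 = z
    return (z2, z3, 0)

# Sample vectors (real and complex) chosen only for illustration.
samples = [(1, 2, 3), (0, -1, 5), (2 + 1j, 3, -4j)]

# R is a square root of T if R(R(z)) equals T(z) for every z.
for z in samples:
    assert R(R(z)) == T(z)

print(R(R((1, 2, 3))))  # (3, 0, 0)
```

A finite set of samples does not prove the identity, of course; here the general computation R(R(z_1, z_2, z_3)) = R(z_2, z_3, 0) = (z_3, 0, 0) does.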


The characterizations of the positive operators in the next result correspond
to characterizations of the nonnegative numbers among $\mathbf{C}$. Specifically, a complex
number $z$ is nonnegative if and only if it has a nonnegative square root,
corresponding to condition (c). Also, $z$ is nonnegative if and only if it has a
real square root, corresponding to condition (d). Finally, $z$ is nonnegative if
and only if there exists a complex number $w$ such that $z = \bar{w}w$,
corresponding to condition (e).


The positive operators correspond
to the numbers $[0, \infty)$, so better
terminology would use the term
nonnegative instead of positive.
However, operator theorists consistently
call these the positive operators,
so we will follow that custom.
7.35  Characterization of positive operators

Let $T \in \mathcal{L}(V)$. Then the following are equivalent:

(a)  $T$ is positive;

(b)  $T$ is self-adjoint and all the eigenvalues of $T$ are nonnegative;

(c)  $T$ has a positive square root;

(d)  $T$ has a self-adjoint square root;

(e)  there exists an operator $R \in \mathcal{L}(V)$ such that $T = R^*R$.

Proof  We will prove that (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (a).

First suppose (a) holds, so that $T$ is positive. Obviously $T$ is self-adjoint
(by the definition of a positive operator). To prove the other condition in (b),
suppose $\lambda$ is an eigenvalue of $T$. Let $v$ be an eigenvector of $T$ corresponding
to $\lambda$. Then

$$0 \le \langle Tv, v\rangle = \langle \lambda v, v\rangle = \lambda \langle v, v\rangle.$$

Thus $\lambda$ is a nonnegative number. Hence (b) holds.

Now suppose (b) holds, so that $T$ is self-adjoint and all the eigenvalues
of $T$ are nonnegative. By the Spectral Theorem (7.24 and 7.29), there is
an orthonormal basis $e_1, \dots, e_n$ of $V$ consisting of eigenvectors of $T$. Let
$\lambda_1, \dots, \lambda_n$ be the eigenvalues of $T$ corresponding to $e_1, \dots, e_n$; thus each
$\lambda_j$ is a nonnegative number. Let $R$ be the linear map from $V$ to $V$ such that

$$Re_j = \sqrt{\lambda_j}\, e_j$$

for $j = 1, \dots, n$ (see 3.5). Then $R$ is a positive operator, as you should verify.
Furthermore, $R^2 e_j = \lambda_j e_j = Te_j$ for each $j$, which implies that $R^2 = T$.
Thus $R$ is a positive square root of $T$. Hence (c) holds.

Clearly (c) implies (d) (because, by definition, every positive operator is
self-adjoint).

Now suppose (d) holds, meaning that there exists a self-adjoint operator
$R$ on $V$ such that $T = R^2$. Then $T = R^*R$ (because $R^* = R$). Hence (e)
holds.

Finally, suppose (e) holds. Let $R \in \mathcal{L}(V)$ be such that $T = R^*R$. Then
$T^* = (R^*R)^* = R^*(R^*)^* = R^*R = T$. Hence $T$ is self-adjoint. To
complete the proof that (a) holds, note that

$$\langle Tv, v\rangle = \langle R^*Rv, v\rangle = \langle Rv, Rv\rangle \ge 0$$

for every $v \in V$. Thus $T$ is positive.  ■
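
The step (b) implies (c) in the proof above is constructive: given an orthonormal eigenbasis with nonnegative eigenvalues, the operator sending each e_j to sqrt(lambda_j) e_j is a positive square root. The following sketch carries this out for one real symmetric 2-by-2 matrix. The matrix, its hand-computed eigenpairs, and the helper names are assumptions made for illustration; the formula R = sum of sqrt(lambda_j) e_j e_j^T used here is valid for real orthonormal eigenvectors (complex vectors would need a conjugate).

```python
import math

# Illustrative self-adjoint matrix with eigenvalues 1 and 3 (both nonnegative).
T = [[2.0, 1.0], [1.0, 2.0]]

# Orthonormal eigenvectors of T, computed by hand for this example.
s = 1 / math.sqrt(2)
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def positive_sqrt(pairs):
    """Build R with R e_j = sqrt(lambda_j) e_j from real orthonormal eigenpairs."""
    n = len(pairs[0][1])
    R = [[0.0] * n for _ in range(n)]
    for lam, e in pairs:
        root = math.sqrt(lam)  # requires lam >= 0, as in condition (b)
        for i in range(n):
            for j in range(n):
                R[i][j] += root * e[i] * e[j]
    return R

R = positive_sqrt(eigenpairs)
R_squared = matmul(R, R)

# R is symmetric and R^2 reproduces T up to rounding error.
print(all(abs(R_squared[i][j] - T[i][j]) < 1e-12 for i in range(2) for j in range(2)))
```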
Each nonnegative number has a
unique nonnegative square root. The
next result shows that positive operators
enjoy a similar property.


Some mathematicians also use the
term positive semidefinite operator,
which means the same as positive
operator.


7.36  Each  positive  operator  has  only  one  positive  square  root 
Every  positive  operator  on  V  has  a  unique  positive  square  root. 


Proof  Suppose $T \in \mathcal{L}(V)$ is positive. Suppose $v \in V$ is an eigenvector of $T$.
Thus there exists $\lambda \ge 0$ such that $Tv = \lambda v$.

Let $R$ be a positive square root of $T$. We will prove that $Rv = \sqrt{\lambda}\, v$. This
will imply that the behavior of $R$ on the eigenvectors of $T$ is uniquely determined.
Because there is a basis of $V$ consisting of eigenvectors of $T$ (by the
Spectral Theorem), this will imply that $R$ is uniquely determined.

To prove that $Rv = \sqrt{\lambda}\, v$, note that the Spectral Theorem asserts that
there is an orthonormal basis $e_1, \dots, e_n$ of $V$ consisting of eigenvectors of $R$.
Because $R$ is a positive operator, all its eigenvalues are nonnegative. Thus
there exist nonnegative numbers $\lambda_1, \dots, \lambda_n$ such that $Re_j = \sqrt{\lambda_j}\, e_j$ for
$j = 1, \dots, n$.

Because $e_1, \dots, e_n$ is a basis of $V$, we can write

$$v = a_1 e_1 + \cdots + a_n e_n$$

for some numbers $a_1, \dots, a_n \in \mathbf{F}$. Thus

$$Rv = a_1 \sqrt{\lambda_1}\, e_1 + \cdots + a_n \sqrt{\lambda_n}\, e_n$$

and hence

$$R^2 v = a_1 \lambda_1 e_1 + \cdots + a_n \lambda_n e_n.$$

But $R^2 = T$, and $Tv = \lambda v$. Thus the equation above implies

$$a_1 \lambda e_1 + \cdots + a_n \lambda e_n = a_1 \lambda_1 e_1 + \cdots + a_n \lambda_n e_n.$$

The equation above implies that $a_j(\lambda - \lambda_j) = 0$ for $j = 1, \dots, n$. Hence

$$v = \sum_{\{j : \lambda_j = \lambda\}} a_j e_j,$$

and thus

$$Rv = \sum_{\{j : \lambda_j = \lambda\}} a_j \sqrt{\lambda}\, e_j = \sqrt{\lambda}\, v,$$

as desired.  ■


A positive operator can have infinitely
many square roots (although
only one of them can be
positive). For example, the identity
operator on $V$ has infinitely many
square roots if $\dim V > 1$.
Isometries

Operators that preserve norms are sufficiently important to deserve a name:


7.37  Definition  isometry

•  An operator $S \in \mathcal{L}(V)$ is called an isometry if

$$\|Sv\| = \|v\|$$

for all $v \in V$.

•  In other words, an operator is an isometry if it preserves norms.


The  Greek  word  isos  means  equal; 
the  Greek  word  metron  means 
measure.  Thus  isometry  literally 
means  equal  measure. 


For example, $\lambda I$ is an isometry
whenever $\lambda \in \mathbf{F}$ satisfies $|\lambda| = 1$. We
will see soon that if $\mathbf{F} = \mathbf{C}$, then the
next example includes all isometries.


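7.38  Example  Suppose $\lambda_1, \dots, \lambda_n$ are scalars with absolute value 1 and
$S \in \mathcal{L}(V)$ satisfies $Se_j = \lambda_j e_j$ for some orthonormal basis $e_1, \dots, e_n$ of $V$.
Show that $S$ is an isometry.


Solution  Suppose $v \in V$. Then

7.39    $v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$

and

7.40    $\|v\|^2 = |\langle v, e_1\rangle|^2 + \cdots + |\langle v, e_n\rangle|^2,$

where we have used 6.30. Applying $S$ to both sides of 7.39 gives

$$Sv = \langle v, e_1\rangle Se_1 + \cdots + \langle v, e_n\rangle Se_n = \lambda_1 \langle v, e_1\rangle e_1 + \cdots + \lambda_n \langle v, e_n\rangle e_n.$$

The last equation, along with the equation $|\lambda_j| = 1$, shows that

7.41    $\|Sv\|^2 = |\langle v, e_1\rangle|^2 + \cdots + |\langle v, e_n\rangle|^2.$

Comparing 7.40 and 7.41 shows that $\|v\| = \|Sv\|$. In other words, $S$ is an
isometry.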
The next result provides several conditions that are equivalent to being an
isometry. The equivalence of (a) and (b) shows that an operator is an isometry if
and only if it preserves inner products. The equivalence of (a) and (c) [or (d)]
shows that an operator is an isometry if and only if the list of columns of its
matrix with respect to every [or some] basis is orthonormal. Exercise 10
implies that in the previous sentence we can replace "columns" with "rows".


An isometry on a real inner product
space is often called an orthogonal
operator. An isometry on a complex
inner product space is often
called a unitary operator. We use
the term isometry so that our results
can apply to both real and
complex inner product spaces.


7.42  Characterization  of  isometries 

Suppose $S \in \mathcal{L}(V)$. Then the following are equivalent:

(a)  $S$ is an isometry;

(b)  $\langle Su, Sv\rangle = \langle u, v\rangle$ for all $u, v \in V$;

(c)  $Se_1, \dots, Se_n$ is orthonormal for every orthonormal list of vectors
$e_1, \dots, e_n$ in $V$;

(d)  there exists an orthonormal basis $e_1, \dots, e_n$ of $V$ such that
$Se_1, \dots, Se_n$ is orthonormal;

(e)  $S^*S = I$;

(f)  $SS^* = I$;

(g)  $S^*$ is an isometry;

(h)  $S$ is invertible and $S^{-1} = S^*$.

Proof  First suppose (a) holds, so $S$ is an isometry. Exercises 19 and 20 in
Section 6.A show that inner products can be computed from norms. Because
$S$ preserves norms, this implies that $S$ preserves inner products, and hence
(b) holds. More precisely, if $V$ is a real inner product space, then for every
$u, v \in V$ we have

$$\begin{aligned}
\langle Su, Sv\rangle &= (\|Su + Sv\|^2 - \|Su - Sv\|^2)/4 \\
&= (\|S(u + v)\|^2 - \|S(u - v)\|^2)/4 \\
&= (\|u + v\|^2 - \|u - v\|^2)/4 \\
&= \langle u, v\rangle,
\end{aligned}$$

where the first equality comes from Exercise 19 in Section 6.A, the second
equality comes from the linearity of $S$, the third equality holds because $S$ is an
isometry, and the last equality again comes from Exercise 19 in Section 6.A.
If $V$ is a complex inner product space, then use Exercise 20 in Section 6.A
instead of Exercise 19 to obtain the same conclusion. In either case, we see
that (b) holds.

Now suppose (b) holds, so $S$ preserves inner products. Suppose that
$e_1, \dots, e_n$ is an orthonormal list of vectors in $V$. Then we see that the list
$Se_1, \dots, Se_n$ is orthonormal because $\langle Se_j, Se_k\rangle = \langle e_j, e_k\rangle$. Thus (c) holds.

Clearly (c) implies (d).

Now suppose (d) holds. Let $e_1, \dots, e_n$ be an orthonormal basis of $V$ such
that $Se_1, \dots, Se_n$ is orthonormal. Thus

$$\langle S^*Se_j, e_k\rangle = \langle e_j, e_k\rangle$$

for $j, k = 1, \dots, n$ [because the term on the left equals $\langle Se_j, Se_k\rangle$ and
$(Se_1, \dots, Se_n)$ is orthonormal]. All vectors $u, v \in V$ can be written as
linear combinations of $e_1, \dots, e_n$, and thus the equation above implies that
$\langle S^*Su, v\rangle = \langle u, v\rangle$. Hence $S^*S = I$; in other words, (e) holds.

Now suppose (e) holds, so that $S^*S = I$. In general, an operator $S$ need
not commute with $S^*$. However, $S^*S = I$ if and only if $SS^* = I$; this is a
special case of Exercise 10 in Section 3.D. Thus $SS^* = I$, showing that (f)
holds.

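Now suppose (f) holds, so $SS^* = I$. If $v \in V$, then

$$\|S^*v\|^2 = \langle S^*v, S^*v\rangle = \langle SS^*v, v\rangle = \langle v, v\rangle = \|v\|^2.$$

Thus $S^*$ is an isometry, showing that (g) holds.

Now suppose (g) holds, so $S^*$ is an isometry. We know that (a) $\Rightarrow$ (e) and
(a) $\Rightarrow$ (f) because we have shown (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (f). Using
the implications (a) $\Rightarrow$ (e) and (a) $\Rightarrow$ (f) but with $S$ replaced with $S^*$ [and
using the equation $(S^*)^* = S$], we conclude that $SS^* = I$ and $S^*S = I$.
Thus $S$ is invertible and $S^{-1} = S^*$; in other words, (h) holds.

Now suppose (h) holds, so $S$ is invertible and $S^{-1} = S^*$. Thus $S^*S = I$.
If $v \in V$, then

$$\|Sv\|^2 = \langle Sv, Sv\rangle = \langle S^*Sv, v\rangle = \langle v, v\rangle = \|v\|^2.$$

Thus $S$ is an isometry, showing that (a) holds.

We have shown (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (f) $\Rightarrow$ (g) $\Rightarrow$ (h) $\Rightarrow$ (a),
completing the proof.  ■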
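
Several of the equivalent conditions in 7.42 are easy to test numerically. The sketch below does so for a rotation of R^2, chosen here purely as a familiar example of an isometry. It assumes the standard inner product and the standard (orthonormal) basis, so that the matrix of S^* is the transpose of the matrix of S; the angle 0.7, the test vector, and the helper names are illustrative.

```python
import math

def rotation(theta):
    """Matrix of the rotation of R^2 by angle theta (standard basis)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def is_identity(A, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

S = rotation(0.7)
S_star = transpose(S)  # adjoint = transpose for a real matrix in an orthonormal basis
v = [3.0, -4.0]

print(is_identity(matmul(S_star, S)))            # condition (e)
print(is_identity(matmul(S, S_star)))            # condition (f)
print(abs(norm(apply(S, v)) - norm(v)) < 1e-12)  # condition (a), for one vector
```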
The previous result shows that every isometry is normal [see (a), (e), and
(f) of 7.42]. Thus characterizations of normal operators can be used to give
descriptions of isometries. We do this in the next result in the complex case
and in Chapter 9 in the real case (see 9.36).


7.43  Description of isometries when $\mathbf{F} = \mathbf{C}$

Suppose $V$ is a complex inner product space and $S \in \mathcal{L}(V)$. Then the
following are equivalent:

(a)  $S$ is an isometry.

(b)  There is an orthonormal basis of $V$ consisting of eigenvectors of $S$
whose corresponding eigenvalues all have absolute value 1.


Proof  We have already shown (see Example 7.38) that (b) implies (a).

To prove the other direction, suppose (a) holds, so $S$ is an isometry. By the
Complex Spectral Theorem (7.24), there is an orthonormal basis $e_1, \dots, e_n$
of $V$ consisting of eigenvectors of $S$. For $j \in \{1, \dots, n\}$, let $\lambda_j$ be the
eigenvalue corresponding to $e_j$. Then

$$|\lambda_j| = \|\lambda_j e_j\| = \|Se_j\| = \|e_j\| = 1.$$

Thus each eigenvalue of $S$ has absolute value 1, completing the proof.  ■


EXERCISES  7.C 


1  Prove or give a counterexample: If $T \in \mathcal{L}(V)$ is self-adjoint and there
exists an orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\langle Te_j, e_j\rangle \ge 0$ for
each $j$, then $T$ is a positive operator.

2  Suppose $T$ is a positive operator on $V$. Suppose $v, w \in V$ are such that

$$Tv = w \quad\text{and}\quad Tw = v.$$

Prove that $v = w$.

3  Suppose $T$ is a positive operator on $V$ and $U$ is a subspace of $V$ invariant
under $T$. Prove that $T|_U \in \mathcal{L}(U)$ is a positive operator on $U$.

4  Suppose $T \in \mathcal{L}(V, W)$. Prove that $T^*T$ is a positive operator on $V$ and
$TT^*$ is a positive operator on $W$.
5  Prove that the sum of two positive operators on $V$ is positive.

6  Suppose $T \in \mathcal{L}(V)$ is positive. Prove that $T^k$ is positive for every
positive integer $k$.

7  Suppose $T$ is a positive operator on $V$. Prove that $T$ is invertible if and
only if

$$\langle Tv, v\rangle > 0$$

for every $v \in V$ with $v \neq 0$.

8  Suppose $T \in \mathcal{L}(V)$. For $u, v \in V$, define $\langle u, v\rangle_T$ by

$$\langle u, v\rangle_T = \langle Tu, v\rangle.$$

Prove that $\langle \cdot, \cdot\rangle_T$ is an inner product on $V$ if and only if $T$ is an invertible
positive operator (with respect to the original inner product $\langle \cdot, \cdot\rangle$).

9  Prove or disprove: the identity operator on $\mathbf{F}^2$ has infinitely many
self-adjoint square roots.

10  Suppose $S \in \mathcal{L}(V)$. Prove that the following are equivalent:

(a)  $S$ is an isometry;

(b)  $\langle S^*u, S^*v\rangle = \langle u, v\rangle$ for all $u, v \in V$;

(c)  $S^*e_1, \dots, S^*e_m$ is an orthonormal list for every orthonormal list
of vectors $e_1, \dots, e_m$ in $V$;

(d)  $S^*e_1, \dots, S^*e_n$ is an orthonormal basis for some orthonormal
basis $e_1, \dots, e_n$ of $V$.

11  Suppose $T_1, T_2$ are normal operators in $\mathcal{L}(\mathbf{F}^3)$ and both operators have
2, 5, 7 as eigenvalues. Prove that there exists an isometry $S \in \mathcal{L}(\mathbf{F}^3)$
such that $T_1 = S^*T_2S$.

12  Give an example of two self-adjoint operators $T_1, T_2 \in \mathcal{L}(\mathbf{F}^4)$ such that
the eigenvalues of both operators are 2, 5, 7 but there does not exist an
isometry $S \in \mathcal{L}(\mathbf{F}^4)$ such that $T_1 = S^*T_2S$. Be sure to explain why
there is no isometry with the required property.

13  Prove or give a counterexample: if $S \in \mathcal{L}(V)$ and there exists an
orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\|Se_j\| = 1$ for each $e_j$, then $S$
is an isometry.

14  Let $T$ be the second derivative operator in Exercise 21 in Section 7.A.
Show that $-T$ is a positive operator.
7.D  Polar Decomposition and Singular Value Decomposition


Polar  Decomposition 

Recall our analogy between $\mathbf{C}$ and $\mathcal{L}(V)$. Under this analogy, a complex
number $z$ corresponds to an operator $T$, and $\bar{z}$ corresponds to $T^*$. The real
numbers ($z = \bar{z}$) correspond to the self-adjoint operators ($T = T^*$), and the
nonnegative numbers correspond to the (badly named) positive operators.

Another distinguished subset of $\mathbf{C}$ is the unit circle, which consists of the
complex numbers $z$ such that $|z| = 1$. The condition $|z| = 1$ is equivalent
to the condition $\bar{z}z = 1$. Under our analogy, this would correspond to the
condition $T^*T = I$, which is equivalent to $T$ being an isometry (see 7.42).
In other words, the unit circle in $\mathbf{C}$ corresponds to the isometries.

Continuing with our analogy, note that each complex number $z$ except 0
can be written in the form

$$z = \Bigl(\frac{z}{|z|}\Bigr)|z| = \Bigl(\frac{z}{|z|}\Bigr)\sqrt{\bar{z}z},$$

where the first factor, namely, $z/|z|$, is an element of the unit circle. Our
analogy leads us to guess that each operator $T \in \mathcal{L}(V)$ can be written as an
isometry times $\sqrt{T^*T}$. That guess is indeed correct, as we now prove after
defining the obvious notation, which is justified by 7.36.


7.44  Notation  $\sqrt{T}$

If $T$ is a positive operator, then $\sqrt{T}$ denotes the unique positive square
root of $T$.


Now we can state and prove the Polar Decomposition, which gives a
beautiful description of an arbitrary operator on $V$. Note that $T^*T$ is a
positive operator for every $T \in \mathcal{L}(V)$, and thus $\sqrt{T^*T}$ is well defined.

7.45  Polar Decomposition

Suppose $T \in \mathcal{L}(V)$. Then there exists an isometry $S \in \mathcal{L}(V)$ such that

$$T = S\sqrt{T^*T}.$$
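Proof  If $v \in V$, then

$$\begin{aligned}
\|Tv\|^2 = \langle Tv, Tv\rangle &= \langle T^*Tv, v\rangle \\
&= \langle \sqrt{T^*T}\sqrt{T^*T}v, v\rangle \\
&= \langle \sqrt{T^*T}v, \sqrt{T^*T}v\rangle \\
&= \|\sqrt{T^*T}v\|^2.
\end{aligned}$$

Thus

7.46    $\|Tv\| = \|\sqrt{T^*T}v\|$

for all $v \in V$.

Define a linear map $S_1 \colon \operatorname{range} \sqrt{T^*T} \to \operatorname{range} T$ by

7.47    $S_1(\sqrt{T^*T}v) = Tv.$

The idea of the proof is to extend $S_1$ to an isometry $S \in \mathcal{L}(V)$ such that
$T = S\sqrt{T^*T}$. Now for the details.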

First we must check that $S_1$ is well defined. To do this, suppose $v_1, v_2 \in V$
are such that $\sqrt{T^*T}v_1 = \sqrt{T^*T}v_2$. For the definition given by 7.47 to make
sense, we must show that $Tv_1 = Tv_2$. Note that

$$\begin{aligned}
\|Tv_1 - Tv_2\| &= \|T(v_1 - v_2)\| \\
&= \|\sqrt{T^*T}(v_1 - v_2)\| \\
&= \|\sqrt{T^*T}v_1 - \sqrt{T^*T}v_2\| \\
&= 0,
\end{aligned}$$

where the second equality holds by 7.46. The equation above shows that
$Tv_1 = Tv_2$, so $S_1$ is indeed well defined. You should verify that $S_1$ is a
linear map.

We see from 7.47 that $S_1$ maps $\operatorname{range} \sqrt{T^*T}$ onto $\operatorname{range} T$. Clearly 7.46
and 7.47 imply that

$$\|S_1 u\| = \|u\|$$

for all $u \in \operatorname{range} \sqrt{T^*T}$.


The rest of the proof extends $S_1$ to
an isometry $S$ on all of $V$.

In particular, $S_1$ is injective. Thus
from the Fundamental Theorem of Linear
Maps (3.22), applied to $S_1$, we have

$$\dim \operatorname{range} \sqrt{T^*T} = \dim \operatorname{range} T.$$
This implies that $\dim (\operatorname{range} \sqrt{T^*T})^\perp = \dim (\operatorname{range} T)^\perp$ (see 6.50).
Thus orthonormal bases $e_1, \dots, e_m$ of $(\operatorname{range} \sqrt{T^*T})^\perp$ and $f_1, \dots, f_m$
of $(\operatorname{range} T)^\perp$ can be chosen; the key point here is that these two orthonormal
bases have the same length (denoted $m$). Now define a linear map
$S_2 \colon (\operatorname{range} \sqrt{T^*T})^\perp \to (\operatorname{range} T)^\perp$ by

$$S_2(a_1 e_1 + \cdots + a_m e_m) = a_1 f_1 + \cdots + a_m f_m.$$

For all $w \in (\operatorname{range} \sqrt{T^*T})^\perp$, we have $\|S_2 w\| = \|w\|$ (from 6.25).

Now let $S$ be the operator on $V$ that equals $S_1$ on $\operatorname{range} \sqrt{T^*T}$ and equals
$S_2$ on $(\operatorname{range} \sqrt{T^*T})^\perp$. More precisely, recall that each $v \in V$ can be written
uniquely in the form

7.48    $v = u + w,$

where $u \in \operatorname{range} \sqrt{T^*T}$ and $w \in (\operatorname{range} \sqrt{T^*T})^\perp$ (see 6.47). For $v \in V$
with decomposition as above, define $Sv$ by

$$Sv = S_1 u + S_2 w.$$

For each $v \in V$ we have

$$S(\sqrt{T^*T}v) = S_1(\sqrt{T^*T}v) = Tv,$$

so $T = S\sqrt{T^*T}$, as desired. All that remains is to show that $S$ is an isometry.
However, this follows easily from two uses of the Pythagorean Theorem: if
$v \in V$ has decomposition as in 7.48, then

$$\|Sv\|^2 = \|S_1 u + S_2 w\|^2 = \|S_1 u\|^2 + \|S_2 w\|^2 = \|u\|^2 + \|w\|^2 = \|v\|^2;$$

the second equality holds because $S_1 u \in \operatorname{range} T$ and $S_2 w \in (\operatorname{range} T)^\perp$.  ■

The  Polar  Decomposition  (7.45)  states  that  each  operator  on  V  is  the 
product  of  an  isometry  and  a  positive  operator.  Thus  we  can  write  each 
operator  on  V  as  the  product  of  two  operators,  each  of  which  comes  from 
a  class  that  we  can  completely  describe  and  that  we  understand  reasonably 
well.  The  isometries  are  described  by  7.43  and  9.36;  the  positive  operators 
are  described  by  the  Spectral  Theorem  (7.24  and  7.29). 

Specifically, consider the case $\mathbf{F} = \mathbf{C}$, and suppose $T = S\sqrt{T^*T}$ is a
Polar Decomposition of an operator $T \in \mathcal{L}(V)$, where $S$ is an isometry. Then
there is an orthonormal basis of $V$ with respect to which $S$ has a diagonal
matrix, and there is an orthonormal basis of $V$ with respect to which $\sqrt{T^*T}$
has a diagonal matrix. Warning: there may not exist an orthonormal basis
that simultaneously puts the matrices of both $S$ and $\sqrt{T^*T}$ into these nice
diagonal forms. In other words, $S$ may require one orthonormal basis and
$\sqrt{T^*T}$ may require a different orthonormal basis.
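
For an invertible operator the isometry in the Polar Decomposition can be computed as S = T (sqrt(T^*T))^{-1}. The sketch below does this for one real 2-by-2 matrix. It is an illustration under several assumptions not taken from the text: the matrix T is an arbitrary invertible example, adjoints are transposes (real entries, standard basis), and the square root of a positive definite 2-by-2 matrix A is obtained from the closed form (A + sqrt(det A) I) / sqrt(tr A + 2 sqrt(det A)), which follows from the Cayley-Hamilton identity for 2-by-2 matrices.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def sqrt_positive_2x2(A):
    """Positive square root of a positive definite 2x2 matrix (closed form)."""
    root_det = math.sqrt(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    scale = math.sqrt(A[0][0] + A[1][1] + 2 * root_det)
    return [[(A[i][j] + (root_det if i == j else 0.0)) / scale for j in range(2)]
            for i in range(2)]

def polar(T):
    """Return (S, R) with T = S R, R = sqrt(T^* T); assumes T is invertible."""
    R = sqrt_positive_2x2(matmul(transpose(T), T))
    S = matmul(T, inverse_2x2(R))
    return S, R

T = [[1.0, 2.0], [0.0, 1.0]]  # illustrative invertible matrix
S, R = polar(T)

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

I2 = [[1.0, 0.0], [0.0, 1.0]]
print(close(matmul(transpose(S), S), I2))  # S is an isometry
print(close(matmul(S, R), T))              # T = S sqrt(T^* T)
```

When T is not invertible this shortcut fails (the square root has no inverse), and S must instead be built by extending S_1 as in the proof of 7.45.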
Singular Value Decomposition

The eigenvalues of an operator tell us something about the behavior of the
operator. Another collection of numbers, called the singular values, is also
useful. Recall that eigenspaces and the notation $E$ are defined in 5.36.

7.49  Definition  singular values

Suppose $T \in \mathcal{L}(V)$. The singular values of $T$ are the eigenvalues
of $\sqrt{T^*T}$, with each eigenvalue $\lambda$ repeated $\dim E(\lambda, \sqrt{T^*T})$ times.

The singular values of $T$ are all nonnegative, because they are the eigenvalues
of the positive operator $\sqrt{T^*T}$.

7.50  Example  Define $T \in \mathcal{L}(\mathbf{F}^4)$ by

$$T(z_1, z_2, z_3, z_4) = (0, 3z_1, 2z_2, -3z_4).$$

Find the singular values of $T$.

Solution  A calculation shows $T^*T(z_1, z_2, z_3, z_4) = (9z_1, 4z_2, 0, 9z_4)$, as
you should verify. Thus

$$\sqrt{T^*T}(z_1, z_2, z_3, z_4) = (3z_1, 2z_2, 0, 3z_4),$$

and we see that the eigenvalues of $\sqrt{T^*T}$ are 3, 2, 0 and

$$\dim E(3, \sqrt{T^*T}) = 2, \quad \dim E(2, \sqrt{T^*T}) = 1, \quad \dim E(0, \sqrt{T^*T}) = 1.$$

Hence the singular values of $T$ are 3, 3, 2, 0.

Note that $-3$ and 0 are the only eigenvalues of $T$. Thus in this case, the
collection of eigenvalues did not pick up the number 2 that appears in the
definition (and hence the behavior) of $T$, but the collection of singular values
does include 2.
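
The calculation in Example 7.50 can be reproduced mechanically. The sketch below writes the matrix of T with respect to the standard basis of F^4 (an orthonormal basis, so the matrix of T^* is the conjugate transpose, which for these real entries is just the transpose), forms T^*T, and reads the singular values off its diagonal. The helper names are illustrative.

```python
import math

# Matrix of T(z1, z2, z3, z4) = (0, 3 z1, 2 z2, -3 z4) in the standard basis.
T = [
    [0, 0, 0, 0],
    [3, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 0, -3],
]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

TstarT = matmul(transpose(T), T)

# T*T is diagonal here, so its eigenvalues are its diagonal entries.
assert all(TstarT[i][j] == 0 for i in range(4) for j in range(4) if i != j)

singular_values = sorted((math.sqrt(TstarT[j][j]) for j in range(4)), reverse=True)
print(singular_values)  # [3.0, 3.0, 2.0, 0.0]
```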


Each $T \in \mathcal{L}(V)$ has $\dim V$ singular values, as can be seen by applying
the Spectral Theorem and 5.41 [see especially part (e)] to the positive (hence
self-adjoint) operator $\sqrt{T^*T}$. For example, the operator $T$ defined in Example
7.50 on the four-dimensional vector space $\mathbf{F}^4$ has four singular values
(they are 3, 3, 2, 0), as we saw above.

The  next  result  shows  that  every  operator  on  V  has  a  clean  description  in 
terms  of  its  singular  values  and  two  orthonormal  bases  of  V. 
7.51  Singular Value Decomposition

Suppose $T \in \mathcal{L}(V)$ has singular values $s_1, \dots, s_n$. Then there exist
orthonormal bases $e_1, \dots, e_n$ and $f_1, \dots, f_n$ of $V$ such that

$$Tv = s_1 \langle v, e_1\rangle f_1 + \cdots + s_n \langle v, e_n\rangle f_n$$

for every $v \in V$.


Proof  By the Spectral Theorem applied to $\sqrt{T^*T}$, there is an orthonormal
basis $e_1, \dots, e_n$ of $V$ such that $\sqrt{T^*T}e_j = s_j e_j$ for $j = 1, \dots, n$.

We have

$$v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$$

for every $v \in V$ (see 6.30). Apply $\sqrt{T^*T}$ to both sides of this equation,
getting

$$\sqrt{T^*T}v = s_1 \langle v, e_1\rangle e_1 + \cdots + s_n \langle v, e_n\rangle e_n$$

for every $v \in V$. By the Polar Decomposition (see 7.45), there is an isometry
$S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$. Apply $S$ to both sides of the equation
above, getting

$$Tv = s_1 \langle v, e_1\rangle Se_1 + \cdots + s_n \langle v, e_n\rangle Se_n$$

for every $v \in V$. For each $j$, let $f_j = Se_j$. Because $S$ is an isometry,
$f_1, \dots, f_n$ is an orthonormal basis of $V$ (see 7.42). The equation above now
becomes

$$Tv = s_1 \langle v, e_1\rangle f_1 + \cdots + s_n \langle v, e_n\rangle f_n$$

for every $v \in V$, completing the proof.  ■


When  we  worked  with  linear  maps  from  one  vector  space  to  a  second 
vector  space,  we  considered  the  matrix  of  a  linear  map  with  respect  to  a  basis 
of  the  first  vector  space  and  a  basis  of  the  second  vector  space.  When  dealing 
with  operators,  which  are  linear  maps  from  a  vector  space  to  itself,  we  almost 
always  use  only  one  basis,  making  it  play  both  roles. 

The Singular Value Decomposition allows us a rare opportunity to make
good use of two different bases for the matrix of an operator. To do this,
suppose $T \in \mathcal{L}(V)$. Let $s_1, \dots, s_n$ denote the singular values of $T$, and let
$e_1, \dots, e_n$ and $f_1, \dots, f_n$ be orthonormal bases of $V$ such that the Singular
Value Decomposition 7.51 holds. Because $Te_j = s_j f_j$ for each $j$, we have

$$\mathcal{M}\bigl(T, (e_1, \dots, e_n), (f_1, \dots, f_n)\bigr) = \begin{pmatrix} s_1 & & 0 \\ & \ddots & \\ 0 & & s_n \end{pmatrix}.$$
In other words, every operator on $V$ has a diagonal matrix with respect
to some orthonormal bases of $V$, provided that we are permitted to use two
different bases rather than a single basis as customary when working with
operators.

Singular values and the Singular Value Decomposition have many applications
(some are given in the exercises), including applications in computational
linear algebra. To compute numeric approximations to the singular values of
an operator $T$, first compute $T^*T$ and then compute approximations to the
eigenvalues of $T^*T$ (good techniques exist for approximating eigenvalues
of positive operators). The nonnegative square roots of these (approximate)
eigenvalues of $T^*T$ will be the (approximate) singular values of $T$. In other
words, the singular values of $T$ can be approximated without computing the
square root of $T^*T$. The next result helps justify working with $T^*T$ instead
of $\sqrt{T^*T}$.

7.52  Singular values without taking square root of an operator

Suppose $T \in \mathcal{L}(V)$. Then the singular values of $T$ are the nonnegative
square roots of the eigenvalues of $T^*T$, with each eigenvalue $\lambda$ repeated
$\dim E(\lambda, T^*T)$ times.

Proof  The Spectral Theorem implies that there are an orthonormal basis
$e_1, \dots, e_n$ and nonnegative numbers $\lambda_1, \dots, \lambda_n$ such that $T^*Te_j = \lambda_j e_j$
for $j = 1, \dots, n$. It is easy to see that $\sqrt{T^*T}e_j = \sqrt{\lambda_j}\, e_j$ for $j = 1, \dots, n$,
which implies the desired result.  ■
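
Result 7.52 is what makes singular values computable without ever forming an operator square root. As a small sketch of that recipe, the function below handles a real 2-by-2 matrix: it forms T^*T (the transpose times T, assuming real entries and the standard basis), finds the two eigenvalues of that symmetric matrix with the quadratic formula, and takes nonnegative square roots. The function name and the sample matrix are illustrative choices, not part of the text.

```python
import math

def singular_values_2x2(T):
    """Singular values of a real 2x2 matrix via the eigenvalues of T^T T."""
    (a, b), (c, d) = T
    # Entries of the symmetric matrix T^T T = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix from the quadratic formula.
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + q * q)
    eigenvalues = [mean + radius, mean - radius]
    # Clamp tiny negative rounding errors before taking square roots.
    return [math.sqrt(max(lam, 0.0)) for lam in eigenvalues]

values = singular_values_2x2([[3.0, 0.0], [4.0, 5.0]])
print(values)  # approximately [6.708, 2.236], i.e. sqrt(45) and sqrt(5)
```

For this matrix T^*T has entries 25, 20, 20, 25, with eigenvalues 45 and 5. As a sanity check, the product of the singular values equals the absolute value of the determinant of the matrix, here 15.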


EXERCISES  7.D 


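1  Fix $u, x \in V$ with $u \neq 0$. Define $T \in \mathcal{L}(V)$ by

$$Tv = \langle v, u\rangle x$$

for every $v \in V$. Prove that

$$\sqrt{T^*T}v = \frac{\|x\|}{\|u\|} \langle v, u\rangle u$$

for every $v \in V$.

2  Give an example of $T \in \mathcal{L}(\mathbf{C}^2)$ such that 0 is the only eigenvalue of $T$
and the singular values of $T$ are 5, 0.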
3  Suppose $T \in \mathcal{L}(V)$. Prove that there exists an isometry $S \in \mathcal{L}(V)$ such
that

$$T = \sqrt{TT^*}\, S.$$

4  Suppose $T \in \mathcal{L}(V)$ and $s$ is a singular value of $T$. Prove that there exists
a vector $v \in V$ such that $\|v\| = 1$ and $\|Tv\| = s$.

5  Suppose $T \in \mathcal{L}(\mathbf{C}^2)$ is defined by $T(x, y) = (-4y, x)$. Find the singular
values of $T$.

6  Find the singular values of the differentiation operator $D \in \mathcal{L}(\mathcal{P}_2(\mathbf{R}))$
defined by $Dp = p'$, where the inner product on $\mathcal{P}_2(\mathbf{R})$ is as in Example
6.33.

7  Define $T \in \mathcal{L}(\mathbf{F}^3)$ by

$$T(z_1, z_2, z_3) = (z_3, 2z_1, 3z_2).$$

Find (explicitly) an isometry $S \in \mathcal{L}(\mathbf{F}^3)$ such that $T = S\sqrt{T^*T}$.

8  Suppose $T \in \mathcal{L}(V)$, $S \in \mathcal{L}(V)$ is an isometry, and $R \in \mathcal{L}(V)$ is a
positive operator such that $T = SR$. Prove that $R = \sqrt{T^*T}$.

[The exercise above shows that if we write $T$ as the product of an isometry
and a positive operator (as in the Polar Decomposition 7.45), then the
positive operator equals $\sqrt{T^*T}$.]

9  Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if there exists
a unique isometry $S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$.

10  Suppose $T \in \mathcal{L}(V)$ is self-adjoint. Prove that the singular values of $T$
equal the absolute values of the eigenvalues of $T$, repeated appropriately.

11  Suppose $T \in \mathcal{L}(V)$. Prove that $T$ and $T^*$ have the same singular values.

12  Prove or give a counterexample: if $T \in \mathcal{L}(V)$, then the singular values
of $T^2$ equal the squares of the singular values of $T$.

13  Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if 0 is not a
singular value of $T$.

14  Suppose $T \in \mathcal{L}(V)$. Prove that $\dim \operatorname{range} T$ equals the number of
nonzero singular values of $T$.


15  Suppose $S \in \mathcal{L}(V)$. Prove that $S$ is an isometry if and only if all the
singular values of $S$ equal 1.

16  Suppose $T_1, T_2 \in \mathcal{L}(V)$. Prove that $T_1$ and $T_2$ have the same singular
values if and only if there exist isometries $S_1, S_2 \in \mathcal{L}(V)$ such that
$T_1 = S_1 T_2 S_2$.

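17  Suppose $T \in \mathcal{L}(V)$ has singular value decomposition given by

$$Tv = s_1 \langle v, e_1 \rangle f_1 + \cdots + s_n \langle v, e_n \rangle f_n$$

for every $v \in V$, where $s_1, \dots, s_n$ are the singular values of $T$ and
$e_1, \dots, e_n$ and $f_1, \dots, f_n$ are orthonormal bases of $V$.

(a)  Prove that if $v \in V$, then

$$T^*v = s_1 \langle v, f_1 \rangle e_1 + \cdots + s_n \langle v, f_n \rangle e_n.$$

(b)  Prove that if $v \in V$, then

$$T^*Tv = s_1^2 \langle v, e_1 \rangle e_1 + \cdots + s_n^2 \langle v, e_n \rangle e_n.$$

(c)  Prove that if $v \in V$, then

$$\sqrt{T^*T}\,v = s_1 \langle v, e_1 \rangle e_1 + \cdots + s_n \langle v, e_n \rangle e_n.$$

(d)  Suppose $T$ is invertible. Prove that if $v \in V$, then

$$T^{-1}v = \frac{\langle v, f_1 \rangle e_1}{s_1} + \cdots + \frac{\langle v, f_n \rangle e_n}{s_n}$$

for every $v \in V$.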


18  Suppose $T \in \mathcal{L}(V)$. Let $\hat{s}$ denote the smallest singular value of $T$, and
let $s$ denote the largest singular value of $T$.

(a)  Prove that $\hat{s}\,\|v\| \le \|Tv\| \le s\,\|v\|$ for every $v \in V$.

(b)  Suppose $\lambda$ is an eigenvalue of $T$. Prove that $\hat{s} \le |\lambda| \le s$.


19  Suppose $T \in \mathcal{L}(V)$. Show that $T$ is uniformly continuous with respect
to the metric $d$ on $V$ defined by $d(u, v) = \|u - v\|$.

20  Suppose $S, T \in \mathcal{L}(V)$. Let $s$ denote the largest singular value of $S$,
let $t$ denote the largest singular value of $T$, and let $r$ denote the largest
singular value of $S + T$. Prove that $r \le s + t$.


CHAPTER 

8 

Hypatia,  the  5th  century  Egyptian 
mathematician  and  philosopher,  as 
envisioned  around  1900  by  Alfred 
Seifert. 


Operators  on  Complex 
Vector  Spaces 


In this chapter we delve deeper into the structure of operators, with most of
the attention on complex vector spaces. An inner product does not help with
this material, so we return to the general setting of a finite-dimensional vector
space. To avoid some trivialities, we will assume that $V \ne \{0\}$. Thus our
assumptions for this chapter are as follows:

8.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  denotes  a  finite-dimensional  nonzero  vector  space  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  generalized  eigenvectors  and  generalized  eigenspaces 

■  characteristic  polynomial  and  the  Cayley-Hamilton  Theorem 

■  decomposition  of  an  operator 

■  minimal  polynomial 

■  Jordan  Form 
8.A  Generalized Eigenvectors and Nilpotent Operators

Null Spaces of Powers of an Operator

We begin this chapter with a study of null spaces of powers of an operator.

8.2  Sequence of increasing null spaces

Suppose $T \in \mathcal{L}(V)$. Then

$$\{0\} = \operatorname{null} T^0 \subset \operatorname{null} T^1 \subset \cdots \subset \operatorname{null} T^k \subset \operatorname{null} T^{k+1} \subset \cdots.$$

Proof  Suppose $k$ is a nonnegative integer and $v \in \operatorname{null} T^k$. Then $T^k v = 0$,
and hence $T^{k+1} v = T(T^k v) = T(0) = 0$. Thus $v \in \operatorname{null} T^{k+1}$. Hence
$\operatorname{null} T^k \subset \operatorname{null} T^{k+1}$, as desired.  ■

The  next  result  says  that  if  two  consecutive  terms  in  this  sequence  of 
subspaces  are  equal,  then  all  later  terms  in  the  sequence  are  equal. 


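8.3  Equality in the sequence of null spaces

Suppose $T \in \mathcal{L}(V)$. Suppose $m$ is a nonnegative integer such that
$\operatorname{null} T^m = \operatorname{null} T^{m+1}$. Then

$$\operatorname{null} T^m = \operatorname{null} T^{m+1} = \operatorname{null} T^{m+2} = \operatorname{null} T^{m+3} = \cdots.$$

Proof  Let $k$ be a positive integer. We want to prove that

$$\operatorname{null} T^{m+k} = \operatorname{null} T^{m+k+1}.$$

We already know from 8.2 that $\operatorname{null} T^{m+k} \subset \operatorname{null} T^{m+k+1}$.

To prove the inclusion in the other direction, suppose $v \in \operatorname{null} T^{m+k+1}$.
Then

$$T^{m+1}(T^k v) = T^{m+k+1} v = 0.$$

Hence

$$T^k v \in \operatorname{null} T^{m+1} = \operatorname{null} T^m.$$

Thus $T^{m+k} v = T^m(T^k v) = 0$, which means that $v \in \operatorname{null} T^{m+k}$. This
implies that $\operatorname{null} T^{m+k+1} \subset \operatorname{null} T^{m+k}$, completing the proof.  ■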
The proposition above raises the question of whether there exists a
nonnegative integer $m$ such that $\operatorname{null} T^m = \operatorname{null} T^{m+1}$. The proposition below
shows that this equality holds at least when $m$ equals the dimension of the
vector space on which $T$ operates.

8.4  Null spaces stop growing

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then

$$\operatorname{null} T^n = \operatorname{null} T^{n+1} = \operatorname{null} T^{n+2} = \cdots.$$

Proof  We need only prove that $\operatorname{null} T^n = \operatorname{null} T^{n+1}$ (by 8.3). Suppose this
is not true. Then, by 8.2 and 8.3, we have

$$\{0\} = \operatorname{null} T^0 \subsetneq \operatorname{null} T^1 \subsetneq \cdots \subsetneq \operatorname{null} T^n \subsetneq \operatorname{null} T^{n+1},$$

where the symbol $\subsetneq$ means "contained in but not equal to". At each of the
strict inclusions in the chain above, the dimension increases by at least 1.
Thus $\dim \operatorname{null} T^{n+1} \ge n + 1$, a contradiction because a subspace of $V$ cannot
have a larger dimension than $n$.  ■
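
The behavior described in 8.2, 8.3, and 8.4 can be observed on a concrete matrix. The following is a minimal sketch in Python, assuming an operator is represented by its matrix with respect to some basis; the helper names `matmul`, `rank`, and `nullities`, and the shift operator used as input, are illustrative choices and do not come from the text. It computes $\dim \operatorname{null} T^k$ as $n - \operatorname{rank}(T^k)$ for $k = 0, \dots, n + 1$.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(A):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def nullities(T):
    """Return [dim null T^0, dim null T^1, ..., dim null T^(n+1)]."""
    n = len(T)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # T^0 = I
    dims = []
    for _ in range(n + 2):
        dims.append(n - rank(power))
        power = matmul(power, T)
    return dims

# Hypothetical example: the shift operator (z1, z2, z3, z4) -> (z2, z3, z4, 0).
shift = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 0]]
print(nullities(shift))  # [0, 1, 2, 3, 4, 4]
```

For this operator the dimensions grow by exactly 1 at each step until they reach $n = 4$, so the exponent $\dim V$ in 8.4 cannot be lowered in general.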

Unfortunately, it is not true that $V = \operatorname{null} T \oplus \operatorname{range} T$ for each $T \in \mathcal{L}(V)$.
However, the following result is a useful substitute.

8.5  $V$ is the direct sum of $\operatorname{null} T^{\dim V}$ and $\operatorname{range} T^{\dim V}$

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then

$$V = \operatorname{null} T^n \oplus \operatorname{range} T^n.$$

Proof  First we show that

$$(\operatorname{null} T^n) \cap (\operatorname{range} T^n) = \{0\}. \tag{8.6}$$

Suppose $v \in (\operatorname{null} T^n) \cap (\operatorname{range} T^n)$. Then $T^n v = 0$, and there exists $u \in V$
such that $v = T^n u$. Applying $T^n$ to both sides of the last equation shows that
$T^n v = T^{2n} u$. Hence $T^{2n} u = 0$, which implies that $T^n u = 0$ (by 8.4). Thus
$v = T^n u = 0$, completing the proof of 8.6.

Now 8.6 implies that $\operatorname{null} T^n + \operatorname{range} T^n$ is a direct sum (by 1.45). Also,

$$\dim(\operatorname{null} T^n \oplus \operatorname{range} T^n) = \dim \operatorname{null} T^n + \dim \operatorname{range} T^n = \dim V,$$

where the first equality above comes from 3.78 and the second equality comes
from the Fundamental Theorem of Linear Maps (3.22). The equation above
implies that $\operatorname{null} T^n \oplus \operatorname{range} T^n = V$, as desired.  ■
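8.7  Example  Suppose $T \in \mathcal{L}(\mathbf{F}^3)$ is defined by

$$T(z_1, z_2, z_3) = (4z_2, 0, 5z_3).$$

For this operator, $\operatorname{null} T + \operatorname{range} T$ is not a direct sum of subspaces, because
$\operatorname{null} T = \{(z_1, 0, 0) : z_1 \in \mathbf{F}\}$ and $\operatorname{range} T = \{(z_1, 0, z_3) : z_1, z_3 \in \mathbf{F}\}$.
Thus $\operatorname{null} T \cap \operatorname{range} T \ne \{0\}$ and hence $\operatorname{null} T + \operatorname{range} T$ is not a direct sum.
Also note that $\operatorname{null} T + \operatorname{range} T \ne \mathbf{F}^3$.

However, we have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$. Thus we see that
$\operatorname{null} T^3 = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbf{F}\}$ and $\operatorname{range} T^3 = \{(0, 0, z_3) : z_3 \in \mathbf{F}\}$.
Hence $\mathbf{F}^3 = \operatorname{null} T^3 \oplus \operatorname{range} T^3$.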
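
The claims in Example 8.7 can be checked mechanically. The sketch below is an illustration in Python and rests on one fact not stated in the text: for a square matrix $A$, $\dim(\operatorname{null} A \cap \operatorname{range} A) = \operatorname{rank} A - \operatorname{rank} A^2$, which follows from applying the Fundamental Theorem of Linear Maps to the restriction of $A$ to $\operatorname{range} A$. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matpow(A, k):
    """A raised to the nonnegative integer power k."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = matmul(power, A)
    return power

def rank(A):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def dim_null_cap_range(A):
    """dim(null A ∩ range A), computed as rank(A) - rank(A^2)."""
    return rank(A) - rank(matmul(A, A))

# Matrix of T(z1, z2, z3) = (4 z2, 0, 5 z3) with respect to the standard basis.
T = [[0, 4, 0],
     [0, 0, 0],
     [0, 0, 5]]
T3 = matpow(T, 3)

print(T3)                      # [[0, 0, 0], [0, 0, 0], [0, 0, 125]]
print(dim_null_cap_range(T))   # 1, so null T + range T is not a direct sum
print(dim_null_cap_range(T3))  # 0, consistent with 8.5
```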


Generalized  Eigenvectors 

Unfortunately,  some  operators  do  not  have  enough  eigenvectors  to  lead  to 
a  good  description.  Thus  in  this  subsection  we  introduce  the  concept  of 
generalized  eigenvectors,  which  will  play  a  major  role  in  our  description  of 
the  structure  of  an  operator. 

To understand why we need more than eigenvectors, let's examine the
question of describing an operator by decomposing its domain into invariant
subspaces. Fix $T \in \mathcal{L}(V)$. We seek to describe $T$ by finding a "nice" direct
sum decomposition

$$V = U_1 \oplus \cdots \oplus U_m,$$

where each $U_j$ is a subspace of $V$ invariant under $T$. The simplest possible
nonzero invariant subspaces are 1-dimensional. A decomposition as above
where each $U_j$ is a 1-dimensional subspace of $V$ invariant under $T$ is possible
if and only if $V$ has a basis consisting of eigenvectors of $T$ (see 5.41). This
happens if and only if $V$ has an eigenspace decomposition

$$V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T), \tag{8.8}$$

where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$ (see 5.41).

The  Spectral  Theorem  in  the  previous  chapter  shows  that  if  V  is  an  inner 
product  space,  then  a  decomposition  of  the  form  8.8  holds  for  every  normal 
operator  if  F  =  C  and  for  every  self-adjoint  operator  if  F  =  R  because 
operators  of  those  types  have  enough  eigenvectors  to  form  a  basis  of  V  (see 
7.24  and  7.29). 
Sadly, a decomposition of the form 8.8 may not hold for more general
operators, even on a complex vector space. An example was given by the operator
in 5.43, which does not have enough eigenvectors for 8.8 to hold. Generalized
eigenvectors and generalized eigenspaces, which we now introduce, will
remedy this situation.

8.9  Definition  generalized eigenvector

Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$. A vector $v \in V$ is called
a generalized eigenvector of $T$ corresponding to $\lambda$ if $v \ne 0$ and

$$(T - \lambda I)^j v = 0$$

for some positive integer $j$.


Although $j$ is allowed to be an arbitrary integer in the equation

$$(T - \lambda I)^j v = 0$$

in the definition of a generalized eigenvector, we will soon prove that every
generalized eigenvector satisfies this equation with $j = \dim V$.

Note that we do not define the concept of a generalized eigenvalue,
because this would not lead to anything new. Reason: if $(T - \lambda I)^j$ is
not injective for some positive integer $j$, then $T - \lambda I$ is not injective,
and hence $\lambda$ is an eigenvalue of $T$.

8.10  Definition  generalized eigenspace, $G(\lambda, T)$

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. The generalized eigenspace of $T$
corresponding to $\lambda$, denoted $G(\lambda, T)$, is defined to be the set of all generalized
eigenvectors of $T$ corresponding to $\lambda$, along with the 0 vector.

Because every eigenvector of $T$ is a generalized eigenvector of $T$ (take
$j = 1$ in the definition of generalized eigenvector), each eigenspace is
contained in the corresponding generalized eigenspace. In other words, if
$T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, then

$$E(\lambda, T) \subset G(\lambda, T).$$

The next result implies that if $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, then $G(\lambda, T)$ is a
subspace of $V$ (because the null space of each linear map on $V$ is a subspace
of $V$).
8.11  Description of generalized eigenspaces

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Then $G(\lambda, T) = \operatorname{null}(T - \lambda I)^{\dim V}$.

Proof  Suppose $v \in \operatorname{null}(T - \lambda I)^{\dim V}$. The definitions imply $v \in G(\lambda, T)$.
Thus $G(\lambda, T) \supset \operatorname{null}(T - \lambda I)^{\dim V}$.

Conversely, suppose $v \in G(\lambda, T)$. Thus there is a positive integer $j$ such
that

$$v \in \operatorname{null}(T - \lambda I)^j.$$

From 8.2 and 8.4 (with $T - \lambda I$ replacing $T$), we get $v \in \operatorname{null}(T - \lambda I)^{\dim V}$.
Thus $G(\lambda, T) \subset \operatorname{null}(T - \lambda I)^{\dim V}$, completing the proof.  ■


8.12  Example  Define $T \in \mathcal{L}(\mathbf{C}^3)$ by

$$T(z_1, z_2, z_3) = (4z_2, 0, 5z_3).$$

(a)  Find all eigenvalues of $T$, the corresponding eigenspaces, and the
corresponding generalized eigenspaces.

(b)  Show that $\mathbf{C}^3$ is the direct sum of generalized eigenspaces
corresponding to the distinct eigenvalues of $T$.

Solution

(a)  A routine use of the definition of eigenvalue shows that the eigenvalues
of $T$ are 0 and 5. The corresponding eigenspaces are easily seen to be
$E(0, T) = \{(z_1, 0, 0) : z_1 \in \mathbf{C}\}$ and $E(5, T) = \{(0, 0, z_3) : z_3 \in \mathbf{C}\}$.

Note that this operator $T$ does not have enough eigenvectors to span its
domain $\mathbf{C}^3$.

We have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$ for all $z_1, z_2, z_3 \in \mathbf{C}$. Thus
8.11 implies that $G(0, T) = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbf{C}\}$.

We have $(T - 5I)^3(z_1, z_2, z_3) = (-125z_1 + 300z_2, -125z_2, 0)$. Thus
8.11 implies that $G(5, T) = \{(0, 0, z_3) : z_3 \in \mathbf{C}\}$.

(b)  The results in part (a) show that $\mathbf{C}^3 = G(0, T) \oplus G(5, T)$.
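
The computations in Example 8.12 can be reproduced with a few lines of exact arithmetic. The following Python sketch is only an illustration: it uses the description $G(\lambda, T) = \operatorname{null}(T - \lambda I)^{\dim V}$ from 8.11, and the names `shifted_power` and `rank` are assumptions made here, not notation from the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(A):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def shifted_power(T, lam, k):
    """Matrix of (T - lam*I)^k."""
    n = len(T)
    S = [[T[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = matmul(power, S)
    return power

# Matrix of T(z1, z2, z3) = (4 z2, 0, 5 z3) with respect to the standard basis.
T = [[0, 4, 0],
     [0, 0, 0],
     [0, 0, 5]]

print(shifted_power(T, 5, 3))  # [[-125, 300, 0], [0, -125, 0], [0, 0, 0]]

# dim G(lam, T) = 3 - rank((T - lam*I)^3), by 8.11.
dims = {lam: 3 - rank(shifted_power(T, lam, 3)) for lam in (0, 5)}
print(dims)  # {0: 2, 5: 1}
```

The two dimensions add up to 3, in agreement with part (b).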
One of our major goals in this chapter is to show that the result in part (b)
of the example above holds in general for operators on finite-dimensional
complex vector spaces; we will do this in 8.21.

We saw earlier (5.10) that eigenvectors corresponding to distinct eigenvalues
are linearly independent. Now we prove a similar result for generalized
eigenvectors.

8.13  Linearly independent generalized eigenvectors

Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$ and
$v_1, \dots, v_m$ are corresponding generalized eigenvectors. Then $v_1, \dots, v_m$
is linearly independent.

Proof  Suppose $a_1, \dots, a_m$ are complex numbers such that

$$0 = a_1 v_1 + \cdots + a_m v_m. \tag{8.14}$$

Let $k$ be the largest nonnegative integer such that $(T - \lambda_1 I)^k v_1 \ne 0$. Let

$$w = (T - \lambda_1 I)^k v_1.$$

Thus

$$(T - \lambda_1 I)w = (T - \lambda_1 I)^{k+1} v_1 = 0,$$

and hence $Tw = \lambda_1 w$. Thus $(T - \lambda I)w = (\lambda_1 - \lambda)w$ for every $\lambda \in \mathbf{F}$ and
hence

$$(T - \lambda I)^n w = (\lambda_1 - \lambda)^n w \tag{8.15}$$

for every $\lambda \in \mathbf{F}$, where $n = \dim V$.

Apply the operator

$$(T - \lambda_1 I)^k (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n$$

to both sides of 8.14, getting

$$\begin{aligned}
0 &= a_1 (T - \lambda_1 I)^k (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n v_1 \\
  &= a_1 (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n w \\
  &= a_1 (\lambda_1 - \lambda_2)^n \cdots (\lambda_1 - \lambda_m)^n w,
\end{aligned}$$

where we have used 8.11 to get the first equation above and 8.15 to get the
last equation above.

The equation above implies that $a_1 = 0$. In a similar fashion, $a_j = 0$ for
each $j$, which implies that $v_1, \dots, v_m$ is linearly independent.  ■
Nilpotent Operators

8.16  Definition  nilpotent

An operator is called nilpotent if some power of it equals 0.

8.17  Example  nilpotent operators

(a)  The operator $N \in \mathcal{L}(\mathbf{F}^4)$ defined by

$$N(z_1, z_2, z_3, z_4) = (z_3, z_4, 0, 0)$$

is nilpotent because $N^2 = 0$.

(b)  The operator of differentiation on $\mathcal{P}_m(\mathbf{R})$ is nilpotent because the
$(m + 1)$st derivative of every polynomial of degree at most $m$ equals 0.
Note that on this space of dimension $m + 1$, we need to raise the
nilpotent operator to the power $m + 1$ to get the 0 operator.


The Latin word nil means nothing or zero; the Latin word potent
means power. Thus nilpotent literally means zero power.

The next result shows that we never need to use a power higher than the
dimension of the space.


8.18  Nilpotent operator raised to dimension of domain is 0

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $N^{\dim V} = 0$.

Proof  Because $N$ is nilpotent, $G(0, N) = V$. Thus 8.11 implies that
$\operatorname{null} N^{\dim V} = V$, as desired.  ■
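
Result 8.18 gives a finite test for nilpotency: it suffices to examine powers up to $\dim V$. The following Python sketch illustrates this on the two operators of Example 8.17, assuming each operator is given by its matrix with respect to a basis; `nilpotency_index` and `differentiation` are hypothetical helper names introduced here.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_zero(A):
    """True if every entry of A equals 0."""
    return all(x == 0 for row in A for x in row)

def nilpotency_index(A):
    """Smallest k >= 1 with A^k = 0, or None if A is not nilpotent.

    By 8.18 only the powers A^1, ..., A^n need to be tested, where n = len(A).
    """
    power = A
    for k in range(1, len(A) + 1):
        if is_zero(power):
            return k
        power = matmul(power, A)
    return None

def differentiation(m):
    """Matrix of differentiation on polynomials of degree at most m,
    with respect to the basis 1, x, ..., x^m (column j holds D(x^j) = j x^(j-1))."""
    return [[j if j == i + 1 else 0 for j in range(m + 1)] for i in range(m + 1)]

# Example 8.17(a): N(z1, z2, z3, z4) = (z3, z4, 0, 0).
N = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

print(nilpotency_index(N))                   # 2
print(nilpotency_index(differentiation(3)))  # 4, the full dimension m + 1
print(nilpotency_index([[1, 1], [0, 1]]))    # None: not nilpotent
```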

Given  an  operator  T  on  V,  we  want  to  find  a  basis  of  V  such  that  the 
matrix  of  T  with  respect  to  this  basis  is  as  simple  as  possible,  meaning  that 
the  matrix  contains  many  0’s. 

The  next  result  shows  that  if  N  is 
nilpotent,  then  we  can  choose  a  basis 
of  V  such  that  the  matrix  of  N  with 
respect  to  this  basis  has  more  than  half 
of  its  entries  equal  to  0.  Later  in  this 
chapter  we  will  do  even  better. 


If $V$ is a complex vector space, a proof of the next result follows easily
from Exercise 7, 5.27, and 5.32. But the proof given here uses simpler
ideas than needed to prove 5.27, and it works for both real and
complex vector spaces.
8.19  Matrix of a nilpotent operator

Suppose $N$ is a nilpotent operator on $V$. Then there is a basis of $V$ with
respect to which the matrix of $N$ has the form

$$\begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix};$$

here all entries on and below the diagonal are 0's.

Proof  First choose a basis of $\operatorname{null} N$. Then extend this to a basis of $\operatorname{null} N^2$.
Then extend to a basis of $\operatorname{null} N^3$. Continue in this fashion, eventually getting
a basis of $V$ (because 8.18 states that $\operatorname{null} N^{\dim V} = V$).

Now let's think about the matrix of $N$ with respect to this basis. The
first column, and perhaps additional columns at the beginning, consists of
all 0's, because the corresponding basis vectors are in $\operatorname{null} N$. The next set
of columns comes from basis vectors in $\operatorname{null} N^2$. Applying $N$ to any such
vector, we get a vector in $\operatorname{null} N$; in other words, we get a vector that is a
linear combination of the previous basis vectors. Thus all nonzero entries in
these columns lie above the diagonal. The next set of columns comes from
basis vectors in $\operatorname{null} N^3$. Applying $N$ to any such vector, we get a vector in
$\operatorname{null} N^2$; in other words, we get a vector that is a linear combination of the
previous basis vectors. Thus once again, all nonzero entries in these columns
lie above the diagonal. Continue in this fashion to complete the proof.  ■


EXERCISES 8.A


1  Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

$$T(w, z) = (z, 0).$$

Find all generalized eigenvectors of $T$.

2  Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

$$T(w, z) = (-z, w).$$

Find the generalized eigenspaces corresponding to the distinct eigenvalues
of $T$.
3  Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that $G(\lambda, T) = G(\frac{1}{\lambda}, T^{-1})$ for
every $\lambda \in \mathbf{F}$ with $\lambda \ne 0$.

4  Suppose $T \in \mathcal{L}(V)$ and $\alpha, \beta \in \mathbf{F}$ with $\alpha \ne \beta$. Prove that

$$G(\alpha, T) \cap G(\beta, T) = \{0\}.$$

5  Suppose $T \in \mathcal{L}(V)$, $m$ is a positive integer, and $v \in V$ is such that
$T^{m-1}v \ne 0$ but $T^m v = 0$. Prove that

$$v, Tv, T^2 v, \dots, T^{m-1}v$$

is linearly independent.

6  Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by $T(z_1, z_2, z_3) = (z_2, z_3, 0)$. Prove
that $T$ has no square root. More precisely, prove that there does not exist
$S \in \mathcal{L}(\mathbf{C}^3)$ such that $S^2 = T$.

7  Suppose $N \in \mathcal{L}(V)$ is nilpotent. Prove that 0 is the only eigenvalue
of $N$.

8  Prove or give a counterexample: The set of nilpotent operators on $V$ is a
subspace of $\mathcal{L}(V)$.

9  Suppose $S, T \in \mathcal{L}(V)$ and $ST$ is nilpotent. Prove that $TS$ is nilpotent.

10  Suppose that $T \in \mathcal{L}(V)$ is not nilpotent. Let $n = \dim V$. Show that
$V = \operatorname{null} T^{n-1} \oplus \operatorname{range} T^{n-1}$.

11  Prove or give a counterexample: If $V$ is a complex vector space and
$\dim V = n$ and $T \in \mathcal{L}(V)$, then $T^n$ is diagonalizable.

12  Suppose $N \in \mathcal{L}(V)$ and there exists a basis of $V$ with respect to which
$N$ has an upper-triangular matrix with only 0's on the diagonal. Prove
that $N$ is nilpotent.

13  Suppose $V$ is an inner product space and $N \in \mathcal{L}(V)$ is normal and
nilpotent. Prove that $N = 0$.

14  Suppose $V$ is an inner product space and $N \in \mathcal{L}(V)$ is nilpotent. Prove
that there exists an orthonormal basis of $V$ with respect to which $N$ has
an upper-triangular matrix.

[If $\mathbf{F} = \mathbf{C}$, then the result above follows from Schur's Theorem (6.38)
without the hypothesis that $N$ is nilpotent. Thus the exercise above needs
to be proved only when $\mathbf{F} = \mathbf{R}$.]
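15  Suppose $N \in \mathcal{L}(V)$ is such that $\operatorname{null} N^{\dim V - 1} \ne \operatorname{null} N^{\dim V}$. Prove
that $N$ is nilpotent and that

$$\dim \operatorname{null} N^j = j$$

for every integer $j$ with $0 \le j \le \dim V$.

16  Suppose $T \in \mathcal{L}(V)$. Show that

$$V = \operatorname{range} T^0 \supset \operatorname{range} T^1 \supset \cdots \supset \operatorname{range} T^k \supset \operatorname{range} T^{k+1} \supset \cdots.$$

17  Suppose $T \in \mathcal{L}(V)$ and $m$ is a nonnegative integer such that

$$\operatorname{range} T^m = \operatorname{range} T^{m+1}.$$

Prove that $\operatorname{range} T^k = \operatorname{range} T^m$ for all $k > m$.

18  Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Prove that

$$\operatorname{range} T^n = \operatorname{range} T^{n+1} = \operatorname{range} T^{n+2} = \cdots.$$

19  Suppose $T \in \mathcal{L}(V)$ and $m$ is a nonnegative integer. Prove that

$$\operatorname{null} T^m = \operatorname{null} T^{m+1} \text{ if and only if } \operatorname{range} T^m = \operatorname{range} T^{m+1}.$$

20  Suppose $T \in \mathcal{L}(\mathbf{C}^5)$ is such that $\operatorname{range} T^4 \ne \operatorname{range} T^5$. Prove that $T$ is
nilpotent.

21  Find a vector space $W$ and $T \in \mathcal{L}(W)$ such that $\operatorname{null} T^k \subsetneq \operatorname{null} T^{k+1}$
and $\operatorname{range} T^k \supsetneq \operatorname{range} T^{k+1}$ for every positive integer $k$.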
8.B  Decomposition of an Operator

Description of Operators on Complex Vector Spaces

We saw earlier that the domain of an operator might not decompose into
eigenspaces, even on a finite-dimensional complex vector space. In this
section we will see that every operator on a finite-dimensional complex vector
space has enough generalized eigenvectors to provide a decomposition.

We observed earlier that if $T \in \mathcal{L}(V)$, then $\operatorname{null} T$ and $\operatorname{range} T$ are
invariant under $T$ [see 5.3, parts (c) and (d)]. Now we show that the null space and
the range of each polynomial of $T$ are also invariant under $T$.

8.20  The null space and range of $p(T)$ are invariant under $T$

Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(\mathbf{F})$. Then $\operatorname{null} p(T)$ and $\operatorname{range} p(T)$ are
invariant under $T$.

Proof  Suppose $v \in \operatorname{null} p(T)$. Then $p(T)v = 0$. Thus

$$(p(T))(Tv) = T(p(T)v) = T(0) = 0.$$

Hence $Tv \in \operatorname{null} p(T)$. Thus $\operatorname{null} p(T)$ is invariant under $T$, as desired.

Suppose $v \in \operatorname{range} p(T)$. Then there exists $u \in V$ such that $v = p(T)u$.
Thus

$$Tv = T(p(T)u) = p(T)(Tu).$$

Hence $Tv \in \operatorname{range} p(T)$. Thus $\operatorname{range} p(T)$ is invariant under $T$, as desired.  ■

The following major result shows that every operator on a complex vector
space can be thought of as composed of pieces, each of which is a nilpotent
operator plus a scalar multiple of the identity. Actually we have already done
the hard work in our discussion of the generalized eigenspaces $G(\lambda, T)$, so at
this point the proof is easy.

8.21  Description of operators on complex vector spaces

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be
the distinct eigenvalues of $T$. Then

(a)  $V = G(\lambda_1, T) \oplus \cdots \oplus G(\lambda_m, T)$;

(b)  each $G(\lambda_j, T)$ is invariant under $T$;

(c)  each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent.
Proof  Let $n = \dim V$. Recall that $G(\lambda_j, T) = \operatorname{null}(T - \lambda_j I)^n$ for each $j$
(by 8.11). From 8.20 [with $p(z) = (z - \lambda_j)^n$], we get (b). Obviously (c)
follows from the definitions.

We will prove (a) by induction on $n$. To get started, note that the desired
result holds if $n = 1$. Thus we can assume that $n > 1$ and that the desired
result holds on all vector spaces of smaller dimension.

Because $V$ is a complex vector space, $T$ has an eigenvalue (see 5.21); thus
$m \ge 1$. Applying 8.5 to $T - \lambda_1 I$ shows that

$$V = G(\lambda_1, T) \oplus U, \tag{8.22}$$

where $U = \operatorname{range}(T - \lambda_1 I)^n$. Using 8.20 [with $p(z) = (z - \lambda_1)^n$], we see
that $U$ is invariant under $T$. Because $G(\lambda_1, T) \ne \{0\}$, we have $\dim U < n$.
Thus we can apply our induction hypothesis to $T|_U$.

None of the generalized eigenvectors of $T|_U$ correspond to the eigenvalue
$\lambda_1$, because all generalized eigenvectors of $T$ corresponding to $\lambda_1$ are in
$G(\lambda_1, T)$. Thus each eigenvalue of $T|_U$ is in $\{\lambda_2, \dots, \lambda_m\}$.

By our induction hypothesis, $U = G(\lambda_2, T|_U) \oplus \cdots \oplus G(\lambda_m, T|_U)$.
Combining this information with 8.22 will complete the proof if we can show
that $G(\lambda_k, T|_U) = G(\lambda_k, T)$ for $k = 2, \dots, m$.

Thus fix $k \in \{2, \dots, m\}$. The inclusion $G(\lambda_k, T|_U) \subset G(\lambda_k, T)$ is clear.
To prove the inclusion in the other direction, suppose $v \in G(\lambda_k, T)$. By
8.22, we can write $v = v_1 + u$, where $v_1 \in G(\lambda_1, T)$ and $u \in U$. Our
induction hypothesis implies that

$$u = v_2 + \cdots + v_m,$$

where each $v_j$ is in $G(\lambda_j, T|_U)$, which is a subset of $G(\lambda_j, T)$. Thus

$$v = v_1 + v_2 + \cdots + v_m.$$

Because generalized eigenvectors corresponding to distinct eigenvalues are
linearly independent (see 8.13), the equation above implies that each $v_j$ equals
0 except possibly when $j = k$. In particular, $v_1 = 0$ and thus $v = u \in U$.
Because $v \in U$, we can conclude that $v \in G(\lambda_k, T|_U)$, completing the
proof.  ■

As we know, an operator on a complex vector space may not have enough
eigenvectors to form a basis of the domain. The next result shows that on a
complex vector space there are enough generalized eigenvectors to do this.

8.23  A basis of generalized eigenvectors

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then there is a basis
of $V$ consisting of generalized eigenvectors of $T$.

Proof  Choose a basis of each $G(\lambda_j, T)$ in 8.21. Put all these bases together
to form a basis of $V$ consisting of generalized eigenvectors of $T$.  ■

Multiplicity of an Eigenvalue

If $V$ is a complex vector space and $T \in \mathcal{L}(V)$, then the decomposition of $V$
provided by 8.21 can be a powerful tool. The dimensions of the subspaces
involved in this decomposition are sufficiently important to get a name.

8.24  Definition  multiplicity

•  Suppose $T \in \mathcal{L}(V)$. The multiplicity of an eigenvalue $\lambda$ of $T$
is defined to be the dimension of the corresponding generalized
eigenspace $G(\lambda, T)$.

•  In other words, the multiplicity of an eigenvalue $\lambda$ of $T$ equals
$\dim \operatorname{null}(T - \lambda I)^{\dim V}$.

The second bullet point above is justified by 8.11.

8.25  Example  Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3, 6z_2 + 2z_3, 7z_3).$$

The matrix of $T$ (with respect to the standard basis) is

$$\begin{pmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{pmatrix}.$$

The eigenvalues of $T$ are 6 and 7, as follows from 5.32. You can verify that
the generalized eigenspaces of $T$ are as follows:

$$G(6, T) = \operatorname{span}\bigl((1, 0, 0), (0, 1, 0)\bigr) \quad\text{and}\quad G(7, T) = \operatorname{span}\bigl((10, 2, 1)\bigr).$$

Thus the eigenvalue 6 has multiplicity 2 and the eigenvalue 7 has multiplicity 1.

The direct sum $\mathbf{C}^3 = G(6, T) \oplus G(7, T)$ is the decomposition promised by
8.21. A basis of $\mathbf{C}^3$ consisting of generalized eigenvectors of $T$, as promised
by 8.23, is

$$(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1).$$
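
The verification left to the reader in Example 8.25 can be sketched in Python. By 8.11, a vector $v$ lies in $G(\lambda, T)$ exactly when $(T - \lambda I)^{\dim V} v = 0$, so it is enough to apply $T - \lambda I$ three times. The function names below are illustrative assumptions, and the matrix is the one read off from the formula for $T$.

```python
def apply(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def in_generalized_eigenspace(T, lam, v):
    """True if (T - lam*I)^n v = 0, where n = dim V (criterion from 8.11)."""
    n = len(T)
    S = [[T[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    w = list(v)
    for _ in range(n):
        w = apply(S, w)
    return all(x == 0 for x in w)

# Matrix of T(z1, z2, z3) = (6 z1 + 3 z2 + 4 z3, 6 z2 + 2 z3, 7 z3).
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

print(in_generalized_eigenspace(T, 6, [1, 0, 0]))   # True
print(in_generalized_eigenspace(T, 6, [0, 1, 0]))   # True
print(in_generalized_eigenspace(T, 7, [10, 2, 1]))  # True
print(in_generalized_eigenspace(T, 6, [0, 0, 1]))   # False
print(apply(T, [10, 2, 1]))                         # [70, 14, 7], i.e. 7 * (10, 2, 1)
```

Note that $(0, 1, 0)$ is a generalized eigenvector for 6 but not an eigenvector, since $T(0, 1, 0) = (3, 6, 0)$.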
In Example 8.25, the sum of the multiplicities of the eigenvalues of $T$
equals 3, which is the dimension of the domain of $T$. The next result shows
that this always happens on a complex vector space.

8.26  Sum of the multiplicities equals $\dim V$

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then the sum of the
multiplicities of all the eigenvalues of $T$ equals $\dim V$.

Proof  The desired result follows from 8.21 and the obvious formula for the
dimension of a direct sum (see 3.78 or Exercise 16 in Section 2.C).  ■

The terms algebraic multiplicity and geometric multiplicity are used in
some books. In case you encounter this terminology, be aware that the
algebraic multiplicity is the same as the multiplicity defined here and the
geometric multiplicity is the dimension of the corresponding eigenspace. In
other words, if $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$, then

$$\begin{aligned}
\text{algebraic multiplicity of } \lambda &= \dim \operatorname{null}(T - \lambda I)^{\dim V} = \dim G(\lambda, T), \\
\text{geometric multiplicity of } \lambda &= \dim \operatorname{null}(T - \lambda I) = \dim E(\lambda, T).
\end{aligned}$$

Note that as defined above, the algebraic multiplicity also has a geometric
meaning as the dimension of a certain null space. The definition of multiplicity
given here is cleaner than the traditional definition that involves determinants;
10.25 implies that these definitions are equivalent.
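
The two notions of multiplicity can be compared directly, since both are dimensions of null spaces. The sketch below, in Python with exact rational arithmetic, computes them for the operator of Example 8.25 as $n - \operatorname{rank}$ of the relevant matrix; the function `multiplicities` is a hypothetical helper, not something defined in the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(A):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def multiplicities(T, lam):
    """Return (algebraic, geometric) multiplicity of lam for the matrix T.

    algebraic = dim null (T - lam*I)^n, geometric = dim null (T - lam*I).
    """
    n = len(T)
    S = [[T[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power = S
    for _ in range(n - 1):
        power = matmul(power, S)  # after the loop, power = S^n
    return n - rank(power), n - rank(S)

# Matrix of the operator in Example 8.25.
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

print(multiplicities(T, 6))  # (2, 1)
print(multiplicities(T, 7))  # (1, 1)
```

For the eigenvalue 6 the geometric multiplicity is strictly smaller than the algebraic multiplicity, which is why this operator has no basis of eigenvectors.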


Block Diagonal Matrices

To interpret our results in matrix form, we make the following definition,
generalizing the notion of a diagonal matrix. If each matrix $A_j$ in the definition
below is a 1-by-1 matrix, then we actually have a diagonal matrix.

Often we can understand a matrix better by thinking of it as composed
of smaller matrices.

8.27  Definition  block diagonal matrix

A block diagonal matrix is a square matrix of the form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where $A_1, \dots, A_m$ are square matrices lying along the diagonal and all
the other entries of the matrix equal 0.
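8.28  Example  The 5-by-5 matrix

$$A = \left(\begin{array}{c|cc|cc}
4 & 0 & 0 & 0 & 0 \\ \hline
0 & 2 & -3 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 \\ \hline
0 & 0 & 0 & 1 & 7 \\
0 & 0 & 0 & 0 & 1
\end{array}\right)$$

is a block diagonal matrix with

$$A = \begin{pmatrix} A_1 & & 0 \\ & A_2 & \\ 0 & & A_3 \end{pmatrix},$$

where

$$A_1 = \begin{pmatrix} 4 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 2 & -3 \\ 0 & 2 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 1 & 7 \\ 0 & 1 \end{pmatrix}.$$

Here the inner matrices in the 5-by-5 matrix above are blocked off to show
how we can think of it as a block diagonal matrix.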
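
Assembling a block diagonal matrix from its blocks is a purely mechanical operation, which the following Python sketch illustrates. The helper `block_diag` is a name chosen here for illustration, and the blocks $(4)$, $\begin{pmatrix} 2 & -3 \\ 0 & 2 \end{pmatrix}$, $\begin{pmatrix} 1 & 7 \\ 0 & 1 \end{pmatrix}$ are an assumed reading of Example 8.28, chosen to be consistent with the eigenvalues 4, 2, 1 and multiplicities 1, 2, 2 mentioned after 8.29.

```python
def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(len(b) for b in blocks)
    A = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                A[offset + i][offset + j] = x
        offset += len(b)
    return A

A = block_diag([[4]],
               [[2, -3], [0, 2]],
               [[1, 7], [0, 1]])
for row in A:
    print(row)
# [4, 0, 0, 0, 0]
# [0, 2, -3, 0, 0]
# [0, 0, 2, 0, 0]
# [0, 0, 0, 1, 7]
# [0, 0, 0, 0, 1]
```

If every block passed to `block_diag` is 1-by-1, the result is an ordinary diagonal matrix.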


Note that in the next result we get many more zeros in the matrix of $T$
than are needed to make it upper triangular.

8.29  Block diagonal matrix with upper-triangular blocks

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be
the distinct eigenvalues of $T$, with multiplicities $d_1, \dots, d_m$. Then there is
a basis of $V$ with respect to which $T$ has a block diagonal matrix of the
form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where each $A_j$ is a $d_j$-by-$d_j$ upper-triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j \end{pmatrix}.$$
Proof  Each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent [see 8.21(c)]. For each $j$, choose
a basis of $G(\lambda_j, T)$, which is a vector space with dimension $d_j$, such that the
matrix of $(T - \lambda_j I)|_{G(\lambda_j, T)}$ with respect to this basis is as in 8.19. Thus the
matrix of $T|_{G(\lambda_j, T)}$, which equals $(T - \lambda_j I)|_{G(\lambda_j, T)} + \lambda_j I|_{G(\lambda_j, T)}$, with
respect to this basis will look like the desired form shown above for $A_j$.

Putting the bases of the $G(\lambda_j, T)$'s together gives a basis of $V$ [by 8.21(a)].
The matrix of $T$ with respect to this basis has the desired form.  ■

The 5-by-5 matrix in 8.28 is of the form promised by 8.29, with each of
the blocks itself an upper-triangular matrix that is constant along the diagonal
of the block. If $T$ is an operator on a 5-dimensional vector space whose matrix
is as in 8.28, then the eigenvalues of $T$ are 4, 2, 1 (as follows from 5.32), with
multiplicities 1, 2, 2.

8.30 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by

\[
T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3,\; 6z_2 + 2z_3,\; 7z_3).
\]

The matrix of $T$ (with respect to the standard basis) is

\[
\begin{pmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{pmatrix},
\]

which is an upper-triangular matrix but is not of the form promised by 8.29.

As we saw in Example 8.25, the eigenvalues of $T$ are 6 and 7 and the
corresponding generalized eigenspaces are

\[
G(6, T) = \operatorname{span}\bigl((1, 0, 0), (0, 1, 0)\bigr)
\quad\text{and}\quad
G(7, T) = \operatorname{span}\bigl((10, 2, 1)\bigr).
\]

We also saw that a basis of $\mathbf{C}^3$ consisting of generalized
eigenvectors of $T$ is

\[
(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1).
\]

The matrix of $T$ with respect to this basis is

\[
\begin{pmatrix} 6 & 3 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 7 \end{pmatrix},
\]

which is a matrix of the block diagonal form promised by 8.29.
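
The change of basis in Example 8.30 can be checked mechanically. The sketch below is only an illustration (the helper `matmul` and the names `P` and `B` are not from the text): it stores the three basis vectors as the columns of `P` and confirms that $TP = PB$, which is another way of saying that `B` is the matrix of $T$ with respect to that basis.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Matrix of T with respect to the standard basis of C^3.
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

# Columns of P are the basis vectors (1,0,0), (0,1,0), (10,2,1).
P = [[1, 0, 10],
     [0, 1, 2],
     [0, 0, 1]]

# Claimed matrix of T with respect to the new basis.
B = [[6, 3, 0],
     [0, 6, 0],
     [0, 0, 7]]

# Column j of T*P is T applied to the j-th basis vector; column j of P*B
# is the combination of basis vectors prescribed by column j of B.
same = matmul(T, P) == matmul(P, B)
print(same)
```

Because the entries are integers, the comparison is exact and no rounding is involved.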

When  we  discuss  the  Jordan  Form  in  Section  8.D,  we  will  see  that  we  can 
find  a  basis  with  respect  to  which  an  operator  T  has  a  matrix  with  even  more 
0’s  than  promised  by  8.29.  However,  8.29  and  its  equivalent  companion  8.21 
are  already  quite  powerful.  For  example,  in  the  next  subsection  we  will  use 
8.21  to  show  that  every  invertible  operator  on  a  complex  vector  space  has  a 
square  root. 
Square Roots

Recall that a square root of an operator $T \in \mathcal{L}(V)$ is an operator
$R \in \mathcal{L}(V)$ such that $R^2 = T$ (see 7.33). Every complex number has
a square root, but not every operator on a complex vector space has a square
root. For example, the operator on $\mathbf{C}^3$ in Exercise 6 in Section 8.A
has no square root. The noninvertibility of that operator is no accident, as we
will soon see. We begin by showing that the identity plus any nilpotent
operator has a square root.

8.31  Identity  plus  nilpotent  has  a  square  root 

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $I + N$ has a square root.

Proof Consider the Taylor series for the function $\sqrt{1 + x}$:

8.32 $\sqrt{1 + x} = 1 + a_1 x + a_2 x^2 + \cdots.$

We will not find an explicit formula for the coefficients or worry about
whether the infinite sum converges because we will use this equation only as
motivation.

Because $N$ is nilpotent, $N^m = 0$ for some positive integer $m$. In 8.32,
suppose we replace $x$ with $N$ and 1 with $I$. Then the infinite sum on the
right side becomes a finite sum (because $N^j = 0$ for all $j \ge m$). In other
words, we guess that there is a square root of $I + N$ of the form

\[
I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1}.
\]

Having made this guess, we can try to choose $a_1, a_2, \dots, a_{m-1}$ such
that the operator above has its square equal to $I + N$. Now

\[
\begin{aligned}
(I + a_1 N &+ a_2 N^2 + a_3 N^3 + \cdots + a_{m-1} N^{m-1})^2 \\
&= I + 2a_1 N + (2a_2 + a_1^2) N^2 + (2a_3 + 2a_1 a_2) N^3 + \cdots \\
&\quad + (2a_{m-1} + \text{terms involving } a_1, \dots, a_{m-2}) N^{m-1}.
\end{aligned}
\]

We want the right side of the equation above to equal $I + N$. Hence choose
$a_1$ such that $2a_1 = 1$ (thus $a_1 = 1/2$). Next, choose $a_2$ such that
$2a_2 + a_1^2 = 0$ (thus $a_2 = -1/8$). Then choose $a_3$ such that the
coefficient of $N^3$ on the right side of the equation above equals 0 (thus
$a_3 = 1/16$). Continue in this fashion for $j = 4, \dots, m - 1$, at each step
solving for $a_j$ so that the coefficient of $N^j$ on the right side of the
equation above equals 0. Actually we do not care about the explicit formula for
the $a_j$'s. We need only know that some choice of the $a_j$'s gives a square
root of $I + N$. ■


Because $a_1 = 1/2$, the formula above shows that $1 + x/2$ is a good estimate
for $\sqrt{1 + x}$ when $x$ is small.
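
The recursive choice of coefficients in the proof of 8.31 is easy to carry out by machine. The following is a minimal sketch, not part of the text: the function names are invented for illustration, and the 4-by-4 shift matrix `N` is an arbitrary nilpotent example chosen here (it satisfies $N^4 = 0$). Exact rational arithmetic is used so that the final check is an equality rather than an approximation.

```python
from fractions import Fraction

def sqrt_coeffs(m):
    """Return [a_0, ..., a_{m-1}] with a_0 = 1, chosen so that the square of
    a_0 + a_1 x + ... + a_{m-1} x^{m-1} agrees with 1 + x through x^{m-1}."""
    a = [Fraction(1)]
    for k in range(1, m):
        target = Fraction(1) if k == 1 else Fraction(0)
        # Coefficient of x^k in the square is 2*a_k + sum_{i=1}^{k-1} a_i a_{k-i}.
        cross = sum(a[i] * a[k - i] for i in range(1, k))
        a.append((target - cross) / 2)
    return a

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sqrt_identity_plus(N, m):
    """Return I + a_1 N + ... + a_{m-1} N^{m-1}, assuming N^m = 0."""
    n = len(N)
    a = sqrt_coeffs(m)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # N^0
    R = [[Fraction(0)] * n for _ in range(n)]
    for k in range(m):
        R = [[R[i][j] + a[k] * power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, N)
    return R

# Illustrative nilpotent operator: N(z1, z2, z3, z4) = (0, z1, z2, z3).
N = [[int(i == j + 1) for j in range(4)] for i in range(4)]
R = sqrt_identity_plus(N, 4)
I_plus_N = [[int(i == j) + N[i][j] for j in range(4)] for i in range(4)]
print(matmul(R, R) == I_plus_N)
```

The first few values returned by `sqrt_coeffs` are 1, 1/2, -1/8, 1/16, matching the values found in the proof.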
The previous lemma is valid on real and complex vector spaces. However, the
next result holds only on complex vector spaces. For example, the operator of
multiplication by $-1$ on the 1-dimensional real vector space $\mathbf{R}$ has
no square root.

8.33  Over  C,  invertible  operators  have  square  roots 

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is invertible.
Then $T$ has a square root.

Proof Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. For
each $j$, there exists a nilpotent operator
$N_j \in \mathcal{L}(G(\lambda_j, T))$ such that
$T|_{G(\lambda_j, T)} = \lambda_j I + N_j$ [see 8.21(c)]. Because $T$ is
invertible, none of the $\lambda_j$'s equals 0, so we can write

\[
T|_{G(\lambda_j, T)} = \lambda_j \Bigl( I + \frac{N_j}{\lambda_j} \Bigr)
\]

for each $j$. Clearly $N_j / \lambda_j$ is nilpotent, and so
$I + N_j / \lambda_j$ has a square root (by 8.31). Multiplying a square root of
the complex number $\lambda_j$ by a square root of $I + N_j / \lambda_j$, we
obtain a square root $R_j$ of $T|_{G(\lambda_j, T)}$.

A typical vector $v \in V$ can be written uniquely in the form

\[
v = u_1 + \cdots + u_m,
\]

where each $u_j$ is in $G(\lambda_j, T)$ (see 8.21). Using this decomposition,
define an operator $R \in \mathcal{L}(V)$ by

\[
Rv = R_1 u_1 + \cdots + R_m u_m.
\]

You should verify that this operator $R$ is a square root of $T$, completing
the proof. ■
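
To see the recipe in the proof of 8.33 at work on a single generalized eigenspace, here is a small sketch under assumptions chosen for illustration: the operator on $\mathbf{C}^2$ with matrix rows $(-4, 1)$ and $(0, -4)$ is not from the text, and it has the single eigenvalue $-4$, so no real square root of the eigenvalue exists. Floating-point complex arithmetic is used, so the check is up to rounding.

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

lam = -4                                  # the only eigenvalue of T
T = [[lam, 1],
     [0, lam]]                            # T = lam*I + N with N^2 = 0

# N / lam is nilpotent with square 0, so the guess from 8.31 stops after the
# linear term: a square root of I + N/lam is I + (1/2)(N/lam).
N_over_lam = [[0, 1 / lam],
              [0, 0]]
S = [[(1 if i == j else 0) + 0.5 * N_over_lam[i][j] for j in range(2)]
     for i in range(2)]

# Multiply by a square root of the complex number lam.
root = cmath.sqrt(lam)
R = [[root * x for x in row] for row in S]

R_squared = matmul(R, R)
error = max(abs(R_squared[i][j] - T[i][j]) for i in range(2) for j in range(2))
print(error < 1e-12)
```

For an operator with several eigenvalues, the same computation would be repeated on each generalized eigenspace and the pieces assembled as in the definition of $R$ in the proof.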

By imitating the techniques in this section, you should be able to prove that
if $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is invertible, then
$T$ has a $k$th root for every positive integer $k$.

EXERCISES  8.B 


1 Suppose $V$ is a complex vector space, $N \in \mathcal{L}(V)$, and 0 is the
only eigenvalue of $N$. Prove that $N$ is nilpotent.

2  Give  an  example  of  an  operator  T  on  a  finite-dimensional  real  vector 
space  such  that  0  is  the  only  eigenvalue  of  T  but  T  is  not  nilpotent. 
3 Suppose $T \in \mathcal{L}(V)$. Suppose $S \in \mathcal{L}(V)$ is invertible.
Prove that $T$ and $S^{-1}TS$ have the same eigenvalues with the same
multiplicities.

4 Suppose $V$ is an $n$-dimensional complex vector space and $T$ is an operator
on $V$ such that $\operatorname{null} T^{n-2} \ne \operatorname{null} T^{n-1}$.
Prove that $T$ has at most two distinct eigenvalues.

5 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Prove that
$V$ has a basis consisting of eigenvectors of $T$ if and only if every
generalized eigenvector of $T$ is an eigenvector of $T$.

[For $\mathbf{F} = \mathbf{C}$, the exercise above adds an equivalence to the
list in 5.41.]

6 Define $N \in \mathcal{L}(\mathbf{F}^5)$ by

\[
N(x_1, x_2, x_3, x_4, x_5) = (2x_2, 3x_3, -x_4, 4x_5, 0).
\]

Find a square root of $I + N$.

7 Suppose $V$ is a complex vector space. Prove that every invertible operator
on $V$ has a cube root.

8 Suppose $T \in \mathcal{L}(V)$ and 3 and 8 are eigenvalues of $T$. Let
$n = \dim V$. Prove that
$V = (\operatorname{null} T^{n-2}) \oplus (\operatorname{range} T^{n-2})$.


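9 Suppose $A$ and $B$ are block diagonal matrices of the form

\[
A = \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},
\qquad
B = \begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_m \end{pmatrix},
\]

where $A_j$ has the same size as $B_j$ for $j = 1, \dots, m$. Show that $AB$ is
a block diagonal matrix of the form

\[
AB = \begin{pmatrix} A_1 B_1 & & 0 \\ & \ddots & \\ 0 & & A_m B_m \end{pmatrix}.
\]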

10 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Prove that
there exist $D, N \in \mathcal{L}(V)$ such that $T = D + N$, the operator $D$
is diagonalizable, $N$ is nilpotent, and $DN = ND$.

11 Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Prove that for
every basis of $V$ with respect to which $T$ has an upper-triangular matrix,
the number of times that $\lambda$ appears on the diagonal of the matrix of $T$
equals the multiplicity of $\lambda$ as an eigenvalue of $T$.
Characteristic and Minimal Polynomials

The  Cayley-Hamilton  Theorem 

The  next  definition  associates  a  polynomial  with  each  operator  on  V  if  F  =  C. 
For  F  =  R,  the  corresponding  definition  will  be  given  in  the  next  chapter. 

8.34  Definition  characteristic  polynomial 

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let
$\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$, with
multiplicities $d_1, \dots, d_m$. The polynomial

\[
(z - \lambda_1)^{d_1} \cdots (z - \lambda_m)^{d_m}
\]

is called the characteristic polynomial of $T$.


8.35 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined as in
Example 8.25. Because the eigenvalues of $T$ are 6, with multiplicity 2, and 7,
with multiplicity 1, we see that the characteristic polynomial of $T$ is
$(z - 6)^2 (z - 7)$.




8.36  Degree  and  zeros  of  characteristic  polynomial 
Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then

(a) the characteristic polynomial of $T$ has degree $\dim V$;

(b) the zeros of the characteristic polynomial of $T$ are the eigenvalues
of $T$.

Proof Clearly part (a) follows from 8.26 and part (b) follows from the
definition of the characteristic polynomial. ■

Most  texts  define  the  characteristic  polynomial  using  determinants  (the 
two  definitions  are  equivalent  by  10.25).  The  approach  taken  here,  which 
is  considerably  simpler,  leads  to  the  following  easy  proof  of  the  Cayley- 
Hamilton  Theorem.  In  the  next  chapter,  we  will  see  that  this  result  also  holds 
on  real  vector  spaces  (see  9.24). 

8.37  Cayley-Hamilton  Theorem 

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $q$
denote the characteristic polynomial of $T$. Then $q(T) = 0$.

Proof Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of the
operator $T$, and let $d_1, \dots, d_m$ be the dimensions of the corresponding
generalized eigenspaces $G(\lambda_1, T), \dots, G(\lambda_m, T)$. For each
$j \in \{1, \dots, m\}$, we know that $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is
nilpotent. Thus we have $(T - \lambda_j I)^{d_j}|_{G(\lambda_j, T)} = 0$ (by
8.18).

Every vector in $V$ is a sum of vectors in
$G(\lambda_1, T), \dots, G(\lambda_m, T)$ (by 8.21). Thus to prove that
$q(T) = 0$, we need only show that $q(T)|_{G(\lambda_j, T)} = 0$ for each $j$.

Thus fix $j \in \{1, \dots, m\}$. We have

\[
q(T) = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_m I)^{d_m}.
\]

The operators on the right side of the equation above all commute, so we can
move the factor $(T - \lambda_j I)^{d_j}$ to be the last term in the expression
on the right. Because $(T - \lambda_j I)^{d_j}|_{G(\lambda_j, T)} = 0$, we
conclude that $q(T)|_{G(\lambda_j, T)} = 0$, as desired. ■
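
As a concrete check of the Cayley-Hamilton Theorem, the sketch below evaluates the characteristic polynomial $(z - 6)^2 (z - 7)$ from Example 8.35 at the operator of Example 8.25, using its matrix with respect to the standard basis. The helper names are invented for this illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def minus_scalar(T, lam):
    """Return the matrix of T - lam*I."""
    n = len(T)
    return [[T[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

def evaluate_factored(T, factors):
    """Evaluate the product of (T - lam*I)^d over the pairs (lam, d)."""
    n = len(T)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for lam, d in factors:
        for _ in range(d):
            result = matmul(result, minus_scalar(T, lam))
    return result

# T(z1, z2, z3) = (6 z1 + 3 z2 + 4 z3, 6 z2 + 2 z3, 7 z3).
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

# Eigenvalue 6 has multiplicity 2 and eigenvalue 7 has multiplicity 1.
q_of_T = evaluate_factored(T, [(6, 2), (7, 1)])
print(q_of_T)
```

All entries of `q_of_T` are 0, as the theorem predicts.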

The  Minimal  Polynomial 

In  this  subsection  we  introduce  another  important  polynomial  associated  with 
each  operator.  We  begin  with  the  following  definition. 

8.38  Definition  monic  polynomial 

A  monic  polynomial  is  a  polynomial  whose  highest-degree  coefficient 
equals  1. 


English mathematician Arthur Cayley (1821-1895) published three math papers
before completing his undergraduate degree in 1842. Irish mathematician William
Rowan Hamilton (1805-1865) was made a professor in 1827 when he was 22 years
old and still an undergraduate!


8.39 Example The polynomial $2 + 9z^2 + z^7$ is a monic polynomial of
degree 7.


8.40  Minimal  polynomial 

Suppose $T \in \mathcal{L}(V)$. Then there is a unique monic polynomial $p$ of
smallest degree such that $p(T) = 0$.
Proof Let $n = \dim V$. Then the list

\[
I, T, T^2, \dots, T^{n^2}
\]

is not linearly independent in $\mathcal{L}(V)$, because the vector space
$\mathcal{L}(V)$ has dimension $n^2$ (see 3.61) and we have a list of length
$n^2 + 1$. Let $m$ be the smallest positive integer such that the list

8.41 $I, T, T^2, \dots, T^m$

is linearly dependent. The Linear Dependence Lemma (2.21) implies that one of
the operators in the list above is a linear combination of the previous ones.
Because $m$ was chosen to be the smallest positive integer such that the list
above is linearly dependent, we conclude that $T^m$ is a linear combination of
$I, T, T^2, \dots, T^{m-1}$. Thus there exist scalars
$a_0, a_1, a_2, \dots, a_{m-1} \in \mathbf{F}$ such that

8.42 $a_0 I + a_1 T + a_2 T^2 + \cdots + a_{m-1} T^{m-1} + T^m = 0.$

Define a monic polynomial $p \in \mathcal{P}(\mathbf{F})$ by

\[
p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m.
\]

Then 8.42 implies that $p(T) = 0$.

To prove the uniqueness part of the result, note that the choice of $m$ implies
that no monic polynomial $q \in \mathcal{P}(\mathbf{F})$ with degree smaller
than $m$ can satisfy $q(T) = 0$. Suppose $q \in \mathcal{P}(\mathbf{F})$ is a
monic polynomial with degree $m$ and $q(T) = 0$. Then $(p - q)(T) = 0$ and
$\deg(p - q) < m$. The choice of $m$ now implies that $q = p$, completing the
proof. ■

The  last  result  justifies  the  following  definition. 

8.43  Definition  minimal  polynomial 

Suppose $T \in \mathcal{L}(V)$. Then the minimal polynomial of $T$ is the
unique monic polynomial $p$ of smallest degree such that $p(T) = 0$.

The proof of the last result shows that the degree of the minimal polynomial of
each operator on $V$ is at most $(\dim V)^2$. The Cayley-Hamilton Theorem
(8.37) tells us that if $V$ is a complex vector space, then the minimal
polynomial of each operator on $V$ has degree at most $\dim V$. This remarkable
improvement also holds on real vector spaces, as we will see in the next
chapter.
Suppose you are given the matrix (with respect to some basis) of an operator
$T \in \mathcal{L}(V)$. You could program a computer to find the minimal
polynomial of $T$ as follows: Consider the system of linear equations

8.44 $a_0 \mathcal{M}(I) + a_1 \mathcal{M}(T) + \cdots + a_{m-1} \mathcal{M}(T)^{m-1} = -\mathcal{M}(T)^m$

for successive values of $m = 1, 2, \dots$ until this system of equations has a
solution $a_0, a_1, a_2, \dots, a_{m-1}$. The scalars
$a_0, a_1, a_2, \dots, a_{m-1}, 1$ will then be the coefficients of the minimal
polynomial of $T$. All this can be computed using a familiar and fast (for a
computer) process such as Gaussian elimination.
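
The procedure just described might be programmed along the following lines. This is a minimal sketch, not a tuned implementation: the function names are invented here, matrices are lists of rows with rational entries, and each matrix is flattened into a vector of length $(\dim V)^2$ so that 8.44 becomes an ordinary linear system solved by Gauss-Jordan elimination with exact fractions.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve(columns, rhs):
    """Solve sum_k x[k] * columns[k] == rhs exactly.

    Returns the solution as a list, or None if the system has no solution.
    Assumes the columns are linearly independent (true in the loop below,
    because no shorter list of powers of T was linearly dependent)."""
    m, rows = len(columns), len(rhs)
    M = [[Fraction(columns[k][r]) for k in range(m)] + [Fraction(rhs[r])]
         for r in range(rows)]
    for c in range(m):
        piv = next(r for r in range(c, rows) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(rows):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    # Rows below the pivots now read 0 = (right side); any nonzero right
    # side means the system is inconsistent.
    if any(row[m] != 0 for row in M[m:]):
        return None
    return [M[k][m] for k in range(m)]

def minimal_polynomial(T):
    """Return [a_0, ..., a_{m-1}, 1], the coefficients of the minimal
    polynomial of the matrix T, lowest degree first."""
    n = len(T)
    flat = lambda A: [x for row in A for x in row]
    powers = [[[Fraction(int(i == j)) for j in range(n)] for i in range(n)]]
    while True:
        next_power = matmul(powers[-1], T)      # T^m, where m = len(powers)
        a = solve([flat(P) for P in powers], [-x for x in flat(next_power)])
        if a is not None:
            return a + [Fraction(1)]
        powers.append(next_power)

# The operator of Example 8.30, in the standard basis.
coeffs = minimal_polynomial([[6, 3, 4], [0, 6, 2], [0, 0, 7]])
print([str(c) for c in coeffs])
```

For the operator of Example 8.30 the coefficients are $-252, 120, -19, 1$, that is, $z^3 - 19z^2 + 120z - 252 = (z - 6)^2 (z - 7)$. The loop terminates because 8.40 guarantees a solution for some $m \le (\dim V)^2$.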

8.45 Example Let $T$ be the operator on $\mathbf{C}^5$ whose matrix (with
respect to the standard basis) is

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & -3 \\
1 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

Find the minimal polynomial of $T$.

Solution Because of the large number of 0's in this matrix, Gaussian
elimination is not needed here. Simply compute powers of $\mathcal{M}(T)$, and
then you will notice that there is clearly no solution to 8.44 until $m = 5$.
Do the computations and you will see that the minimal polynomial of $T$ equals
$z^5 - 6z + 3$.


Think of 8.44 as a system of $(\dim V)^2$ linear equations in the $m$ variables
$a_0, a_1, \dots, a_{m-1}$.


The  next  result  completely  characterizes  the  polynomials  that  when  applied 
to  an  operator  give  the  0  operator. 

8.46 $q(T) = 0$ implies $q$ is a multiple of the minimal polynomial

Suppose $T \in \mathcal{L}(V)$ and $q \in \mathcal{P}(\mathbf{F})$. Then
$q(T) = 0$ if and only if $q$ is a polynomial multiple of the minimal
polynomial of $T$.

Proof Let $p$ denote the minimal polynomial of $T$.

First we prove the easy direction. Suppose $q$ is a polynomial multiple of $p$.
Thus there exists a polynomial $s \in \mathcal{P}(\mathbf{F})$ such that
$q = ps$. We have

\[
q(T) = p(T)s(T) = 0\,s(T) = 0,
\]

as desired.

To prove the other direction, now suppose $q(T) = 0$. By the Division Algorithm
for Polynomials (4.8), there exist polynomials
$s, r \in \mathcal{P}(\mathbf{F})$ such that

8.47 $q = ps + r$

and $\deg r < \deg p$. We have

\[
0 = q(T) = p(T)s(T) + r(T) = r(T).
\]

The equation above implies that $r = 0$ (otherwise, dividing $r$ by its
highest-degree coefficient would produce a monic polynomial that when applied
to $T$ gives 0; this polynomial would have a smaller degree than the minimal
polynomial, which would be a contradiction). Thus 8.47 becomes the equation
$q = ps$. Hence $q$ is a polynomial multiple of $p$, as desired. ■

The  next  result  is  stated  only  for  complex  vector  spaces,  because  we  have 
not  yet  defined  the  characteristic  polynomial  when  F  =  R.  However,  the 
result  also  holds  for  real  vector  spaces,  as  we  will  see  in  the  next  chapter. 

8.48  Characteristic  polynomial  is  a  multiple  of  minimal  polynomial 

Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Then the
characteristic polynomial of $T$ is a polynomial multiple of the minimal
polynomial of $T$.

Proof  The  desired  result  follows  immediately  from  the  Cayley-Hamilton 
Theorem  (8.37)  and  8.46.  ■ 

We  know  (at  least  when  F  =  C)  that  the  zeros  of  the  characteristic 
polynomial  of  T  are  the  eigenvalues  of  T  (see  8.36).  Now  we  show  that  the 
minimal  polynomial  has  the  same  zeros  (although  the  multiplicities  of  these 
zeros  may  differ). 

8.49  Eigenvalues  are  the  zeros  of  the  minimal  polynomial 

Let $T \in \mathcal{L}(V)$. Then the zeros of the minimal polynomial of $T$ are
precisely the eigenvalues of $T$.

Proof Let

\[
p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m
\]

be the minimal polynomial of $T$.

First suppose $\lambda \in \mathbf{F}$ is a zero of $p$. Then $p$ can be
written in the form

\[
p(z) = (z - \lambda) q(z),
\]

where $q$ is a monic polynomial with coefficients in $\mathbf{F}$ (see 4.11).
Because $p(T) = 0$, we have

\[
0 = (T - \lambda I)(q(T)v)
\]

for all $v \in V$. Because the degree of $q$ is less than the degree of the
minimal polynomial $p$, there exists at least one vector $v \in V$ such that
$q(T)v \ne 0$. The equation above thus implies that $\lambda$ is an eigenvalue
of $T$, as desired.

To prove the other direction, now suppose $\lambda \in \mathbf{F}$ is an
eigenvalue of $T$. Thus there exists $v \in V$ with $v \ne 0$ such that
$Tv = \lambda v$. Repeated applications of $T$ to both sides of this equation
show that $T^j v = \lambda^j v$ for every nonnegative integer $j$. Thus

\[
\begin{aligned}
0 = p(T)v &= (a_0 I + a_1 T + a_2 T^2 + \cdots + a_{m-1} T^{m-1} + T^m)v \\
&= (a_0 + a_1 \lambda + a_2 \lambda^2 + \cdots + a_{m-1} \lambda^{m-1} + \lambda^m)v \\
&= p(\lambda)v.
\end{aligned}
\]

Because $v \ne 0$, the equation above implies that $p(\lambda) = 0$, as
desired. ■

The  next  three  examples  show  how  our  results  can  be  useful  in  finding 
minimal  polynomials  and  in  understanding  why  eigenvalues  of  some  operators 
cannot  be  exactly  computed. 

8.50 Example Find the minimal polynomial of the operator
$T \in \mathcal{L}(\mathbf{C}^3)$ in Example 8.30.

Solution In Example 8.30 we noted that the eigenvalues of $T$ are 6 and 7.
Thus by 8.49, the minimal polynomial of $T$ is a polynomial multiple of
$(z - 6)(z - 7)$.

In Example 8.35, we saw that the characteristic polynomial of $T$ is
$(z - 6)^2 (z - 7)$. Thus by 8.48 and the paragraph above, the minimal
polynomial of $T$ is either $(z - 6)(z - 7)$ or $(z - 6)^2 (z - 7)$. A simple
computation shows that

\[
(T - 6I)(T - 7I) \ne 0.
\]

Thus the minimal polynomial of $T$ is $(z - 6)^2 (z - 7)$.
8.51 Example Find the minimal polynomial of the operator
$T \in \mathcal{L}(\mathbf{C}^3)$ defined by
$T(z_1, z_2, z_3) = (6z_1, 6z_2, 7z_3)$.

Solution It is easy to see that for this operator $T$, the eigenvalues of $T$
are 6 and 7, and the characteristic polynomial of $T$ is $(z - 6)^2 (z - 7)$.

Thus as in the previous example, the minimal polynomial of $T$ is either
$(z - 6)(z - 7)$ or $(z - 6)^2 (z - 7)$. A simple computation shows that
$(T - 6I)(T - 7I) = 0$. Thus the minimal polynomial of $T$ is
$(z - 6)(z - 7)$.
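
The "simple computation" used in Examples 8.50 and 8.51 is a single matrix product. The sketch below carries it out for both operators, using their matrices with respect to the standard basis; the function name `product_6_7` is invented for this illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_6_7(T):
    """Return the matrix of (T - 6I)(T - 7I)."""
    n = len(T)
    T6 = [[T[i][j] - (6 if i == j else 0) for j in range(n)] for i in range(n)]
    T7 = [[T[i][j] - (7 if i == j else 0) for j in range(n)] for i in range(n)]
    return matmul(T6, T7)

T_850 = [[6, 3, 4], [0, 6, 2], [0, 0, 7]]   # operator of Examples 8.30 and 8.50
T_851 = [[6, 0, 0], [0, 6, 0], [0, 0, 7]]   # operator of Example 8.51

zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(product_6_7(T_850) == zero)   # False: (z - 6)(z - 7) is not enough
print(product_6_7(T_851) == zero)   # True: (z - 6)(z - 7) already annihilates T
```

The two operators have the same characteristic polynomial, yet only the second is annihilated by $(z - 6)(z - 7)$, which is why their minimal polynomials differ.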


8.52  Example  What  are  the  eigenvalues  of  the  operator  in  Example  8.45? 

Solution  From  8.49  and  the  solution  to  Example  8.45,  we  see  that  the 
eigenvalues  of  T  equal  the  solutions  to  the  equation 

\[
z^5 - 6z + 3 = 0.
\]

Unfortunately, no solution to this equation can be computed using rational
numbers, roots of rational numbers, and the usual rules of arithmetic (a proof
of this would take us considerably beyond linear algebra). Thus we cannot find
an exact expression for any eigenvalue of $T$ in any familiar form, although
numeric techniques can give good approximations for the eigenvalues of $T$. The
numeric techniques, which we will not discuss here, show that the eigenvalues
for this particular operator are approximately

\[
-1.67, \quad 0.51, \quad 1.40, \quad -0.12 + 1.59i, \quad -0.12 - 1.59i.
\]

The  nonreal  eigenvalues  occur  as  a  pair,  with  each  the  complex  conjugate  of 
the  other,  as  expected  for  a  polynomial  with  real  coefficients  (see  4.15). 
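
One simple numeric technique is Newton's method. The sketch below is an illustration only and is not claimed to be the method used to produce the values above: it starts from the five approximations listed in Example 8.52 and refines each one by repeatedly replacing $z$ with $z - p(z)/p'(z)$, where $p(z) = z^5 - 6z + 3$.

```python
def p(z):
    return z**5 - 6*z + 3

def dp(z):
    """Derivative of p."""
    return 5*z**4 - 6

def newton(z, steps=50):
    """Refine an approximate zero of p by Newton's method."""
    for _ in range(steps):
        z = z - p(z) / dp(z)
    return z

# Starting points: the approximate eigenvalues quoted in Example 8.52.
starts = [-1.67, 0.51, 1.40, complex(-0.12, 1.59), complex(-0.12, -1.59)]
refined = [newton(z) for z in starts]

for z0, z in zip(starts, refined):
    print(z0, "->", z, " |p(z)| =", abs(p(z)))
```

Each refined value stays within 0.01 of its starting point and makes $|p(z)|$ negligibly small, consistent with the two-decimal approximations given in the example.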


EXERCISES  8.C 


1 Suppose $T \in \mathcal{L}(\mathbf{C}^4)$ is such that the eigenvalues of $T$
are 3, 5, 8. Prove that $(T - 3I)^2 (T - 5I)^2 (T - 8I)^2 = 0$.

2 Suppose $V$ is a complex vector space. Suppose $T \in \mathcal{L}(V)$ is such
that 5 and 6 are eigenvalues of $T$ and that $T$ has no other eigenvalues.
Prove that $(T - 5I)^{n-1} (T - 6I)^{n-1} = 0$, where $n = \dim V$.

3 Give an example of an operator on $\mathbf{C}^4$ whose characteristic
polynomial equals $(z - 7)^2 (z - 8)^2$.

4 Give an example of an operator on $\mathbf{C}^4$ whose characteristic
polynomial equals $(z - 1)(z - 5)^3$ and whose minimal polynomial equals
$(z - 1)(z - 5)^2$.

5 Give an example of an operator on $\mathbf{C}^4$ whose characteristic and
minimal polynomials both equal $z(z - 1)^2 (z - 3)$.

6 Give an example of an operator on $\mathbf{C}^4$ whose characteristic
polynomial equals $z(z - 1)^2 (z - 3)$ and whose minimal polynomial equals
$z(z - 1)(z - 3)$.

7 Suppose $V$ is a complex vector space. Suppose $P \in \mathcal{L}(V)$ is such
that $P^2 = P$. Prove that the characteristic polynomial of $P$ is
$z^m (z - 1)^n$, where $m = \dim \operatorname{null} P$ and
$n = \dim \operatorname{range} P$.

8 Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if
the constant term in the minimal polynomial of $T$ is nonzero.

9 Suppose $T \in \mathcal{L}(V)$ has minimal polynomial
$4 + 5z - 6z^2 - 7z^3 + 2z^4 + z^5$. Find the minimal polynomial of $T^{-1}$.

10 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is
invertible. Let $p$ denote the characteristic polynomial of $T$ and let $q$
denote the characteristic polynomial of $T^{-1}$. Prove that

\[
q(z) = \frac{1}{p(0)}\, z^{\dim V}\, p\Bigl(\frac{1}{z}\Bigr)
\]

for all nonzero $z \in \mathbf{C}$.

11 Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that there exists a
polynomial $p \in \mathcal{P}(\mathbf{F})$ such that $T^{-1} = p(T)$.

12 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Prove that
$V$ has a basis consisting of eigenvectors of $T$ if and only if the minimal
polynomial of $T$ has no repeated zeros.

[For complex vector spaces, the exercise above adds another equivalence to the
list given by 5.41.]

13 Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$ is normal.
Prove that the minimal polynomial of $T$ has no repeated zeros.

14 Suppose $V$ is a complex inner product space and $S \in \mathcal{L}(V)$ is
an isometry. Prove that the constant term in the characteristic polynomial of
$S$ has absolute value 1.
15 Suppose $T \in \mathcal{L}(V)$ and $v \in V$.

(a) Prove that there exists a unique monic polynomial $p$ of smallest degree
such that $p(T)v = 0$.

(b) Prove that $p$ divides the minimal polynomial of $T$.

16 Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$. Suppose

\[
a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m
\]

is the minimal polynomial of $T$. Prove that

\[
\overline{a_0} + \overline{a_1} z + \overline{a_2} z^2 + \cdots + \overline{a_{m-1}} z^{m-1} + z^m
\]

is the minimal polynomial of $T^*$.

17 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Suppose the
minimal polynomial of $T$ has degree $\dim V$. Prove that the characteristic
polynomial of $T$ equals the minimal polynomial of $T$.

18 Suppose $a_0, \dots, a_{n-1} \in \mathbf{C}$. Find the minimal and
characteristic polynomials of the operator on $\mathbf{C}^n$ whose matrix (with
respect to the standard basis) is

\[
\begin{pmatrix}
0 & & & & & -a_0 \\
1 & 0 & & & & -a_1 \\
 & 1 & \ddots & & & -a_2 \\
 & & \ddots & & & \vdots \\
 & & & & 0 & -a_{n-2} \\
 & & & & 1 & -a_{n-1}
\end{pmatrix}.
\]

[The exercise above shows that every monic polynomial is the characteristic
polynomial of some operator.]

19 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Suppose
that with respect to some basis of $V$ the matrix of $T$ is upper triangular,
with $\lambda_1, \dots, \lambda_n$ on the diagonal of this matrix. Prove that
the characteristic polynomial of $T$ is
$(z - \lambda_1) \cdots (z - \lambda_n)$.

20 Suppose $V$ is a complex vector space and $V_1, \dots, V_m$ are nonzero
subspaces of $V$ such that $V = V_1 \oplus \cdots \oplus V_m$. Suppose
$T \in \mathcal{L}(V)$ and each $V_j$ is invariant under $T$. For each $j$, let
$p_j$ denote the characteristic polynomial of $T|_{V_j}$. Prove that the
characteristic polynomial of $T$ equals $p_1 \cdots p_m$.
Jordan Form


We know that if $V$ is a complex vector space, then for every
$T \in \mathcal{L}(V)$ there is a basis of $V$ with respect to which $T$ has a
nice upper-triangular matrix (see 8.29). In this section we will see that we
can do even better: there is a basis of $V$ with respect to which the matrix of
$T$ contains 0's everywhere except possibly on the diagonal and the line
directly above the diagonal.

We  begin  by  looking  at  two  examples  of  nilpotent  operators. 

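8.53 Example Let $N \in \mathcal{L}(\mathbf{F}^4)$ be the nilpotent operator
defined by

\[
N(z_1, z_2, z_3, z_4) = (0, z_1, z_2, z_3).
\]

If $v = (1, 0, 0, 0)$, then $N^3 v, N^2 v, Nv, v$ is a basis of $\mathbf{F}^4$.
The matrix of $N$ with respect to this basis is

\[
\begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]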

The  next  example  of  a  nilpotent  operator  has  more  complicated  behavior 
than  the  example  above. 

8.54 Example Let $N \in \mathcal{L}(\mathbf{F}^6)$ be the nilpotent operator
defined by

\[
N(z_1, z_2, z_3, z_4, z_5, z_6) = (0, z_1, z_2, 0, z_4, 0).
\]

Unlike the nice behavior of the nilpotent operator of the previous example, for
this nilpotent operator there does not exist a vector $v \in \mathbf{F}^6$ such
that $N^5 v, N^4 v, N^3 v, N^2 v, Nv, v$ is a basis of $\mathbf{F}^6$. However,
if we take $v_1 = (1, 0, 0, 0, 0, 0)$, $v_2 = (0, 0, 0, 1, 0, 0)$, and
$v_3 = (0, 0, 0, 0, 0, 1)$, then $N^2 v_1, Nv_1, v_1, Nv_2, v_2, v_3$ is a
basis of $\mathbf{F}^6$. The matrix of $N$ with respect to this basis is

\[
\left(\begin{array}{ccc|cc|c}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).
\]

Here the inner matrices are blocked off to show that we can think of the 6-by-6
matrix above as a block diagonal matrix consisting of a 3-by-3 block with 1's
on the line above the diagonal and 0's elsewhere, a 2-by-2 block with 1 above
the diagonal and 0's elsewhere, and a 1-by-1 block containing 0.
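
The matrix in Example 8.54 can be rebuilt directly from the definition of $N$ and the chosen basis. The sketch below is illustrative (the helper `coords` is invented here) and relies on a shortcut that happens to apply in this example: $N$ sends each basis vector either to 0 or to another vector of the same basis, so coordinates can be found by lookup instead of by solving a linear system.

```python
def N(z):
    """The nilpotent operator of Example 8.54 on 6-tuples."""
    z1, z2, z3, z4, z5, z6 = z
    return (0, z1, z2, 0, z4, 0)

v1 = (1, 0, 0, 0, 0, 0)
v2 = (0, 0, 0, 1, 0, 0)
v3 = (0, 0, 0, 0, 0, 1)

# The basis N^2 v1, N v1, v1, N v2, v2, v3, in that order.
basis = [N(N(v1)), N(v1), v1, N(v2), v2, v3]

def coords(w):
    """Coordinates of w with respect to `basis`; valid only when w is 0 or
    one of the basis vectors, which is all that is needed here."""
    if not any(w):
        return [0] * len(basis)
    k = basis.index(w)
    return [int(i == k) for i in range(len(basis))]

# Column j of the matrix holds the coordinates of N applied to basis[j].
columns = [coords(N(b)) for b in basis]
matrix = [[columns[j][i] for j in range(6)] for i in range(6)]

for row in matrix:
    print(row)
```

The printed rows reproduce the block diagonal matrix displayed in the example, with blocks of sizes 3, 2, and 1.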
Our next result shows that every nilpotent operator $N \in \mathcal{L}(V)$
behaves similarly to the previous example. Specifically, there is a finite
collection of vectors $v_1, \dots, v_n \in V$ such that there is a basis of $V$
consisting of the vectors of the form $N^k v_j$, as $j$ varies from 1 to $n$
and $k$ varies (in reverse order) from 0 to the largest nonnegative integer
$m_j$ such that $N^{m_j} v_j \ne 0$. For the matrix interpretation of the next
result, see the first part of the proof of 8.60.

8.55 Basis corresponding to a nilpotent operator

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then there exist vectors
$v_1, \dots, v_n \in V$ and nonnegative integers $m_1, \dots, m_n$ such that

(a) $N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$ is a basis of $V$;

(b) $N^{m_1 + 1} v_1 = \cdots = N^{m_n + 1} v_n = 0$.


Proof We will prove this result by induction on $\dim V$. To get started, note
that the desired result obviously holds if $\dim V = 1$ (in that case, the only
nilpotent operator is the 0 operator, so take $v_1$ to be any nonzero vector
and $m_1 = 0$). Now assume that $\dim V > 1$ and that the desired result holds
on all vector spaces of smaller dimension.

Because $N$ is nilpotent, $N$ is not injective. Thus $N$ is not surjective (by
3.69) and hence $\operatorname{range} N$ is a subspace of $V$ that has a
smaller dimension than $V$. Thus we can apply our induction hypothesis to the
restriction operator
$N|_{\operatorname{range} N} \in \mathcal{L}(\operatorname{range} N)$. [We can
ignore the trivial case $\operatorname{range} N = \{0\}$, because in that case
$N$ is the 0 operator and we can choose $v_1, \dots, v_n$ to be any basis of
$V$ and $m_1 = \cdots = m_n = 0$ to get the desired result.]

By our induction hypothesis applied to $N|_{\operatorname{range} N}$, there
exist vectors $v_1, \dots, v_n \in \operatorname{range} N$ and nonnegative
integers $m_1, \dots, m_n$ such that

8.56 $N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$

is a basis of $\operatorname{range} N$ and

\[
N^{m_1 + 1} v_1 = \cdots = N^{m_n + 1} v_n = 0.
\]

Because each $v_j$ is in $\operatorname{range} N$, for each $j$ there exists
$u_j \in V$ such that $v_j = N u_j$. Thus $N^{k+1} u_j = N^k v_j$ for each $j$
and each nonnegative integer $k$. We now claim that

8.57 $N^{m_1 + 1} u_1, \dots, N u_1, u_1, \dots, N^{m_n + 1} u_n, \dots, N u_n, u_n$

is a linearly independent list of vectors in $V$. To verify this claim, suppose
that some linear combination of 8.57 equals 0. Applying $N$ to that linear
combination, we get a linear combination of 8.56 equal to 0. However, the list
8.56 is linearly independent, and hence all the coefficients in our original
linear combination of 8.57 equal 0 except possibly the coefficients of the
vectors

\[
N^{m_1 + 1} u_1, \dots, N^{m_n + 1} u_n,
\]

which equal the vectors

\[
N^{m_1} v_1, \dots, N^{m_n} v_n.
\]

Again using the linear independence of the list 8.56, we conclude that those
coefficients also equal 0, completing our proof that the list 8.57 is linearly
independent.

Now extend 8.57 to a basis

8.58 $N^{m_1 + 1} u_1, \dots, N u_1, u_1, \dots, N^{m_n + 1} u_n, \dots, N u_n, u_n, w_1, \dots, w_p$

of $V$ (which is possible by 2.33). Each $N w_j$ is in $\operatorname{range} N$
and hence is in the span of 8.56. Each vector in the list 8.56 equals $N$
applied to some vector in the list 8.57. Thus there exists $x_j$ in the span of
8.57 such that $N w_j = N x_j$. Now let

\[
u_{n+j} = w_j - x_j.
\]

Then $N u_{n+j} = 0$. Furthermore,

\[
N^{m_1 + 1} u_1, \dots, N u_1, u_1, \dots, N^{m_n + 1} u_n, \dots, N u_n, u_n, u_{n+1}, \dots, u_{n+p}
\]

spans $V$ because its span contains each $x_j$ and each $u_{n+j}$ and hence
each $w_j$ (and because 8.58 spans $V$).

Thus the spanning list above is a basis of $V$ because it has the same length
as the basis 8.58 (where we have used 2.42). This basis has the required form,
completing the proof. ■


In the next definition, the diagonal of each $A_j$ is filled with some eigenvalue $\lambda_j$ of T, the line directly above the diagonal of $A_j$ is filled with 1's, and all other entries in $A_j$ are 0 (to understand why each $\lambda_j$ is an eigenvalue of T, see 5.32). The $\lambda_j$'s need not be distinct. Also, $A_j$ may be a 1-by-1 matrix $(\lambda_j)$ containing just an eigenvalue of T.


French mathematician Camille Jordan (1838-1922) first published a proof of 8.60 in 1870.
8.59  Definition  Jordan basis

Suppose $T \in \mathcal{L}(V)$. A basis of V is called a Jordan basis for T if with respect to this basis T has a block diagonal matrix

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_p \end{pmatrix},$$

where each $A_j$ is an upper-triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{pmatrix}.$$
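To make the shape of such a matrix concrete, here is a minimal Python sketch. The helper names `jordan_block`, `block_diagonal`, and `mat_mul`, and the sample eigenvalues 5 and 7, are illustrative assumptions and do not come from the text.

```python
def jordan_block(lam, size):
    """size-by-size matrix: lam on the diagonal, 1's directly above it, 0 elsewhere."""
    return [[lam if r == c else (1 if c == r + 1 else 0) for c in range(size)]
            for r in range(size)]

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a larger matrix of zeros."""
    n = sum(len(b) for b in blocks)
    out = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, entry in enumerate(row):
                out[offset + r][offset + c] = entry
        offset += len(b)
    return out

def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# Hypothetical example: a 2-by-2 block for eigenvalue 5 and a 1-by-1 block for 7.
J = block_diagonal([jordan_block(5, 2), jordan_block(7, 1)])

# A block with 0 on the diagonal is nilpotent: its cube is the zero matrix here.
N = jordan_block(0, 3)
N3 = mat_mul(mat_mul(N, N), N)
```

In this sketch `J` has rows `[5, 1, 0]`, `[0, 5, 0]`, `[0, 0, 7]`, which matches the block diagonal pattern described in the definition.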


8.60  Jordan  Form 

Suppose V is a complex vector space. If $T \in \mathcal{L}(V)$, then there is a basis of V that is a Jordan basis for T.

Proof  First consider a nilpotent operator $N \in \mathcal{L}(V)$ and the vectors $v_1, \dots, v_n \in V$ given by 8.55. For each j, note that N sends the first vector in the list $N^{m_j}v_j, \dots, Nv_j, v_j$ to 0 and that N sends each vector in this list other than the first vector to the previous vector. In other words, 8.55 gives a basis of V with respect to which N has a block diagonal matrix, where each matrix on the diagonal has the form

$$\begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix}.$$

Thus  the  desired  result  holds  for  nilpotent  operators. 

Now suppose $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of T. We have the generalized eigenspace decomposition

$$V = G(\lambda_1, T) \oplus \cdots \oplus G(\lambda_m, T),$$

where each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent (see 8.21). Thus some basis of each $G(\lambda_j, T)$ is a Jordan basis for $(T - \lambda_j I)|_{G(\lambda_j, T)}$ (see previous paragraph). Put these bases together to get a basis of V that is a Jordan basis for T.  ■
EXERCISES 8.D


1  Find  the  characteristic  polynomial  and  the  minimal  polynomial  of  the 
operator  N  in  Example  8.53. 

2  Find  the  characteristic  polynomial  and  the  minimal  polynomial  of  the 
operator  N  in  Example  8.54. 

3  Suppose $N \in \mathcal{L}(V)$ is nilpotent. Prove that the minimal polynomial of N is $z^{m+1}$, where m is the length of the longest consecutive string of 1's that appears on the line directly above the diagonal in the matrix of N with respect to any Jordan basis for N.

4  Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of V that is a Jordan basis for T. Describe the matrix of T with respect to the basis $v_n, \dots, v_1$ obtained by reversing the order of the v's.

5  Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of V that is a Jordan basis for T. Describe the matrix of $T^2$ with respect to this basis.

6  Suppose $N \in \mathcal{L}(V)$ is nilpotent and $v_1, \dots, v_n$ and $m_1, \dots, m_n$ are as in 8.55. Prove that $N^{m_1}v_1, \dots, N^{m_n}v_n$ is a basis of null N.

[The exercise above implies that n, which equals dim null N, depends only on N and not on the specific Jordan basis chosen for N.]

7  Suppose $p, q \in \mathcal{P}(\mathbf{C})$ are monic polynomials with the same zeros and q is a polynomial multiple of p. Prove that there exists $T \in \mathcal{L}(\mathbf{C}^{\deg q})$ such that the characteristic polynomial of T is q and the minimal polynomial of T is p.

8  Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Prove that there does not exist a direct sum decomposition of V into two proper subspaces invariant under T if and only if the minimal polynomial of T is of the form $(z - \lambda)^{\dim V}$ for some $\lambda \in \mathbf{C}$.
Operators on Real Vector Spaces


In  the  last  chapter  we  learned  about  the  structure  of  an  operator  on  a  finite- 
dimensional  complex  vector  space.  In  this  chapter,  we  will  use  our  results 
about  operators  on  complex  vector  spaces  to  learn  about  operators  on  real 
vector  spaces. 

Our  assumptions  for  this  chapter  are  as  follows: 

9.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V  denotes  a  finite-dimensional  nonzero  vector  space  over  F. 


LEARNING  OBJECTIVES  FOR  THIS  CHAPTER 

■  complexification  of  a  real  vector  space 

■  complexification  of  an  operator  on  a  real  vector  space 

■  operators  on  finite-dimensional  real  vector  spaces  have  an 
eigenvalue  or  a  2-dimensional  invariant  subspace 

■  characteristic  polynomial  and  the  Cayley-Hamilton  Theorem 

■  description  of  normal  operators  on  a  real  inner  product  space 

■  description  of  isometries  on  a  real  inner  product  space 
Complexification

Complexification  of  a  Vector  Space 

As we will soon see, a real vector space V can be embedded, in a natural way, in a complex vector space called the complexification of V. Each operator on V can be extended to an operator on the complexification of V. Our results about operators on complex vector spaces can then be translated to information about operators on real vector spaces.

We  begin  by  defining  the  complexification  of  a  real  vector  space. 

9.2  Definition  complexification of V, $V_{\mathbf{C}}$

Suppose V is a real vector space.

•  The complexification of V, denoted $V_{\mathbf{C}}$, equals $V \times V$. An element of $V_{\mathbf{C}}$ is an ordered pair (u, v), where $u, v \in V$, but we will write this as u + iv.

•  Addition on $V_{\mathbf{C}}$ is defined by

$$(u_1 + iv_1) + (u_2 + iv_2) = (u_1 + u_2) + i(v_1 + v_2)$$

for $u_1, v_1, u_2, v_2 \in V$.

•  Complex scalar multiplication on $V_{\mathbf{C}}$ is defined by

$$(a + bi)(u + iv) = (au - bv) + i(av + bu)$$

for $a, b \in \mathbf{R}$ and $u, v \in V$.

Motivation for the definition above of complex scalar multiplication comes from usual algebraic properties and the identity $i^2 = -1$. If you remember the motivation, then you do not need to memorize the definition above.

We think of V as a subset of $V_{\mathbf{C}}$ by identifying $u \in V$ with u + i0. The construction of $V_{\mathbf{C}}$ from V can then be thought of as generalizing the construction of $\mathbf{C}^n$ from $\mathbf{R}^n$.

9.3  $V_{\mathbf{C}}$ is a complex vector space.

Suppose V is a real vector space. Then with the definitions of addition and scalar multiplication as above, $V_{\mathbf{C}}$ is a complex vector space.

The proof of the result above is left as an exercise for the reader. Note that the additive identity of $V_{\mathbf{C}}$ is 0 + i0, which we write as just 0.
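The definitions of addition and complex scalar multiplication can be tried out in a short Python sketch. It assumes V is $\mathbf{R}^n$ with vectors stored as tuples, so an element u + iv of the complexification is the pair `(u, v)`; the function names `add`, `scalar_mul`, and `to_complex` are illustrative choices, not notation from the text.

```python
def add(p, q):
    """(u1 + i v1) + (u2 + i v2) = (u1 + u2) + i(v1 + v2)."""
    (u1, v1), (u2, v2) = p, q
    return (tuple(x + y for x, y in zip(u1, u2)),
            tuple(x + y for x, y in zip(v1, v2)))

def scalar_mul(z, p):
    """(a + bi)(u + iv) = (au - bv) + i(av + bu), where z = a + bi."""
    a, b = z.real, z.imag
    u, v = p
    return (tuple(a * x - b * y for x, y in zip(u, v)),
            tuple(a * y + b * x for x, y in zip(u, v)))

def to_complex(p):
    """Identify the pair (u, v) with the vector u + iv in C^n."""
    u, v = p
    return tuple(complex(x, y) for x, y in zip(u, v))

p = ((1.0, 2.0), (3.0, 4.0))      # the element (1, 2) + i(3, 4)
z = 2 + 3j

# Multiplying by i twice should give -p, reflecting i^2 = -1.
minus_p = scalar_mul(1j, scalar_mul(1j, p))

# The pair-based rule should agree with ordinary multiplication in C^n.
via_pairs = to_complex(scalar_mul(z, p))
via_builtin = tuple(z * w for w in to_complex(p))
```

The agreement of `via_pairs` and `via_builtin` illustrates the remark that the construction generalizes the passage from $\mathbf{R}^n$ to $\mathbf{C}^n$.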
Probably everything that you think should work concerning complexification does work, usually with a straightforward verification, as illustrated by the next result.

9.4  Basis of V is basis of $V_{\mathbf{C}}$

Suppose V is a real vector space.

(a)  If $v_1, \dots, v_n$ is a basis of V (as a real vector space), then $v_1, \dots, v_n$ is a basis of $V_{\mathbf{C}}$ (as a complex vector space).

(b)  The dimension of $V_{\mathbf{C}}$ (as a complex vector space) equals the dimension of V (as a real vector space).

Proof  To prove (a), suppose $v_1, \dots, v_n$ is a basis of the real vector space V. Then $\operatorname{span}(v_1, \dots, v_n)$ in the complex vector space $V_{\mathbf{C}}$ contains all the vectors $v_1, \dots, v_n, iv_1, \dots, iv_n$. Thus $v_1, \dots, v_n$ spans the complex vector space $V_{\mathbf{C}}$.

To show that $v_1, \dots, v_n$ is linearly independent in the complex vector space $V_{\mathbf{C}}$, suppose $\lambda_1, \dots, \lambda_n \in \mathbf{C}$ and

$$\lambda_1 v_1 + \cdots + \lambda_n v_n = 0.$$

Then the equation above and our definitions imply that

$$(\operatorname{Re}\lambda_1)v_1 + \cdots + (\operatorname{Re}\lambda_n)v_n = 0 \quad\text{and}\quad (\operatorname{Im}\lambda_1)v_1 + \cdots + (\operatorname{Im}\lambda_n)v_n = 0.$$

Because $v_1, \dots, v_n$ is linearly independent in V, the equations above imply $\operatorname{Re}\lambda_1 = \cdots = \operatorname{Re}\lambda_n = 0$ and $\operatorname{Im}\lambda_1 = \cdots = \operatorname{Im}\lambda_n = 0$. Thus we have $\lambda_1 = \cdots = \lambda_n = 0$. Hence $v_1, \dots, v_n$ is linearly independent in $V_{\mathbf{C}}$, completing the proof of (a).

Clearly (b) follows immediately from (a).  ■

Complexification  of  an  Operator 

Now  we  can  define  the  complexification  of  an  operator. 

9.5  Definition  complexification of T, $T_{\mathbf{C}}$

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. The complexification of T, denoted $T_{\mathbf{C}}$, is the operator $T_{\mathbf{C}} \in \mathcal{L}(V_{\mathbf{C}})$ defined by

$$T_{\mathbf{C}}(u + iv) = Tu + iTv$$

for $u, v \in V$.
You should verify that if V is a real vector space and $T \in \mathcal{L}(V)$, then $T_{\mathbf{C}}$ is indeed in $\mathcal{L}(V_{\mathbf{C}})$. The key point here is that our definition of complex scalar multiplication can be used to show that $T_{\mathbf{C}}(\lambda(u + iv)) = \lambda T_{\mathbf{C}}(u + iv)$ for all $u, v \in V$ and all complex numbers $\lambda$.

The  next  example  gives  a  good  way  to  think  about  the  complexification  of 
a  typical  operator. 


9.6  Example  Suppose A is an n-by-n matrix of real numbers. Define $T \in \mathcal{L}(\mathbf{R}^n)$ by $Tx = Ax$, where elements of $\mathbf{R}^n$ are thought of as n-by-1 column vectors. Identifying the complexification of $\mathbf{R}^n$ with $\mathbf{C}^n$, we then have $T_{\mathbf{C}}z = Az$ for each $z \in \mathbf{C}^n$, where again elements of $\mathbf{C}^n$ are thought of as n-by-1 column vectors.

In other words, if T is the operator of matrix multiplication by A on $\mathbf{R}^n$, then the complexification $T_{\mathbf{C}}$ is also matrix multiplication by A but now acting on the larger domain $\mathbf{C}^n$.
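The identity $T_{\mathbf{C}}(u + iv) = Tu + iTv$ for a matrix operator can be checked numerically. The sketch below is only an illustration: the matrix `A`, the vectors `u` and `v`, and the helper `apply` are hypothetical choices, not taken from the text.

```python
def apply(A, z):
    """Multiply the matrix A (list of rows) by the column vector z."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]

A = [[1, 2],
     [3, 4]]                 # a real matrix, chosen only for illustration
u = [1, 0]                   # real part of z
v = [2, -1]                  # imaginary part of z
z = [complex(x, y) for x, y in zip(u, v)]

# Left side: T_C z computed as the matrix product A z in C^2.
lhs = apply(A, z)

# Right side: Tu + iTv, computed from two real matrix products.
Tu, Tv = apply(A, u), apply(A, v)
rhs = [complex(x, y) for x, y in zip(Tu, Tv)]
```

Both sides come out as the vector (1, 3 + 2i), as the example predicts.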

The  next  result  makes  sense  because  9.4  tells  us  that  a  basis  of  a  real  vector 
space  is  also  a  basis  of  its  complexification.  The  proof  of  the  next  result 
follows  immediately  from  the  definitions. 

9.7  Matrix of $T_{\mathbf{C}}$ equals matrix of T

Suppose V is a real vector space with basis $v_1, \dots, v_n$ and $T \in \mathcal{L}(V)$. Then $\mathcal{M}(T) = \mathcal{M}(T_{\mathbf{C}})$, where both matrices are with respect to the basis $v_1, \dots, v_n$.

The result above and Example 9.6 provide complete insight into complexification, because once a basis is chosen, every operator essentially looks like Example 9.6. Complexification of an operator could have been defined using matrices, but the approach taken here is more natural because it does not depend on the choice of a basis.

We know that every operator on a nonzero finite-dimensional complex vector space has an eigenvalue (see 5.21) and thus has a 1-dimensional invariant subspace. We have seen an example [5.8(a)] of an operator on a nonzero finite-dimensional real vector space with no eigenvalues and thus no 1-dimensional invariant subspaces. However, we now show that an invariant subspace of dimension 1 or 2 always exists. Notice how complexification leads to a simple proof of this result.
9.8  Every operator has an invariant subspace of dimension 1 or 2

Every operator on a nonzero finite-dimensional vector space has an invariant subspace of dimension 1 or 2.

Proof  Every operator on a nonzero finite-dimensional complex vector space has an eigenvalue (5.21) and thus has a 1-dimensional invariant subspace.

Hence assume V is a real vector space and $T \in \mathcal{L}(V)$. The complexification $T_{\mathbf{C}}$ has an eigenvalue a + bi (by 5.21), where $a, b \in \mathbf{R}$. Thus there exist $u, v \in V$, not both 0, such that $T_{\mathbf{C}}(u + iv) = (a + bi)(u + iv)$. Using the definition of $T_{\mathbf{C}}$, the last equation can be rewritten as

$$Tu + iTv = (au - bv) + (av + bu)i.$$

Thus

$$Tu = au - bv \quad\text{and}\quad Tv = av + bu.$$

Let U equal the span in V of the list u, v. Then U is a subspace of V with dimension 1 or 2. The equations above show that U is invariant under T, completing the proof.  ■
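The two real equations at the heart of this proof can be checked on a small case. The sketch below assumes, purely for illustration, the operator on $\mathbf{R}^2$ given by T(x, y) = (−y, x), a quarter-turn rotation with no real eigenvalues; the helper `apply` and all variable names are hypothetical.

```python
def apply(A, z):
    """Multiply the matrix A (list of rows) by the column vector z."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]

A = [[0, -1],
     [1, 0]]                 # matrix of T(x, y) = (-y, x), an assumed example

# An eigenvalue a + bi of the complexification, with eigenvector u + iv.
a, b = 0, 1
u = [1, 0]
v = [0, -1]
z = [complex(x, y) for x, y in zip(u, v)]   # z = (1, -i)

# Check the complex eigenvector equation T_C z = (a + bi) z.
is_eigenvector = apply(A, z) == [complex(a, b) * w for w in z]

# The real and imaginary parts give Tu = au - bv and Tv = av + bu.
Tu, Tv = apply(A, u), apply(A, v)
au_minus_bv = [a * x - b * y for x, y in zip(u, v)]
av_plus_bu = [a * y + b * x for x, y in zip(u, v)]
```

Here u and v are linearly independent, so the invariant subspace U produced by the proof is all of $\mathbf{R}^2$, which has dimension 2.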

The  Minimal  Polynomial  of  the  Complexification 

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Repeated application of the definition of $T_{\mathbf{C}}$ shows that

9.9  $(T_{\mathbf{C}})^n(u + iv) = T^n u + iT^n v$

for every positive integer n and all $u, v \in V$.

Notice that the next result implies that the minimal polynomial of $T_{\mathbf{C}}$ has real coefficients.

9.10  Minimal polynomial of $T_{\mathbf{C}}$ equals minimal polynomial of T

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Then the minimal polynomial of $T_{\mathbf{C}}$ equals the minimal polynomial of T.

Proof  Let $p \in \mathcal{P}(\mathbf{R})$ denote the minimal polynomial of T. From 9.9 it is easy to see that $p(T_{\mathbf{C}}) = (p(T))_{\mathbf{C}}$, and thus $p(T_{\mathbf{C}}) = 0$.

Suppose $q \in \mathcal{P}(\mathbf{C})$ is a monic polynomial such that $q(T_{\mathbf{C}}) = 0$. Then $(q(T_{\mathbf{C}}))(u) = 0$ for every $u \in V$. Letting r denote the polynomial whose jth coefficient is the real part of the jth coefficient of q, we see that r is a monic polynomial and $r(T) = 0$. Thus $\deg q = \deg r \geq \deg p$.

The conclusions of the two previous paragraphs imply that p is the minimal polynomial of $T_{\mathbf{C}}$, as desired.  ■
Eigenvalues of the Complexification

Now we turn to questions about the eigenvalues of the complexification of an operator. Again, everything that we expect to work indeed works easily.

We begin with a result showing that the real eigenvalues of $T_{\mathbf{C}}$ are precisely the eigenvalues of T. We give two different proofs of this result. The first proof is more elementary, but the second proof is shorter and gives some useful insight.

9.11  Real eigenvalues of $T_{\mathbf{C}}$

Suppose V is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{R}$. Then $\lambda$ is an eigenvalue of $T_{\mathbf{C}}$ if and only if $\lambda$ is an eigenvalue of T.

Proof 1  First suppose $\lambda$ is an eigenvalue of T. Then there exists $v \in V$ with $v \neq 0$ such that $Tv = \lambda v$. Thus $T_{\mathbf{C}}v = \lambda v$, which shows that $\lambda$ is an eigenvalue of $T_{\mathbf{C}}$, completing one direction of the proof.

To prove the other direction, suppose now that $\lambda$ is an eigenvalue of $T_{\mathbf{C}}$. Then there exist $u, v \in V$ with $u + iv \neq 0$ such that

$$T_{\mathbf{C}}(u + iv) = \lambda(u + iv).$$

The equation above implies that $Tu = \lambda u$ and $Tv = \lambda v$. Because $u \neq 0$ or $v \neq 0$, this implies that $\lambda$ is an eigenvalue of T, completing the proof.  ■

Proof 2  The (real) eigenvalues of T are the (real) zeros of the minimal polynomial of T (by 8.49). The real eigenvalues of $T_{\mathbf{C}}$ are the real zeros of the minimal polynomial of $T_{\mathbf{C}}$ (again by 8.49). These two minimal polynomials are the same (by 9.10). Thus the eigenvalues of T are precisely the real eigenvalues of $T_{\mathbf{C}}$, as desired.  ■

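Our next result shows that $T_{\mathbf{C}}$ behaves symmetrically with respect to an eigenvalue $\lambda$ and its complex conjugate $\bar{\lambda}$.

9.12  $T_{\mathbf{C}} - \lambda I$ and $T_{\mathbf{C}} - \bar{\lambda} I$

Suppose V is a real vector space, $T \in \mathcal{L}(V)$, $\lambda \in \mathbf{C}$, j is a nonnegative integer, and $u, v \in V$. Then

$$(T_{\mathbf{C}} - \lambda I)^j(u + iv) = 0 \quad\text{if and only if}\quad (T_{\mathbf{C}} - \bar{\lambda} I)^j(u - iv) = 0.$$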
Proof  We will prove this result by induction on j. To get started, note that if j = 0 then (because an operator raised to the power 0 equals the identity operator) the result claims that u + iv = 0 if and only if u − iv = 0, which is clearly true.

Thus assume by induction that $j \geq 1$ and the desired result holds for j − 1. Suppose $(T_{\mathbf{C}} - \lambda I)^j(u + iv) = 0$. Then

9.13  $(T_{\mathbf{C}} - \lambda I)^{j-1}\bigl((T_{\mathbf{C}} - \lambda I)(u + iv)\bigr) = 0.$

Writing $\lambda = a + bi$, where $a, b \in \mathbf{R}$, we have

9.14  $(T_{\mathbf{C}} - \lambda I)(u + iv) = (Tu - au + bv) + i(Tv - av - bu)$

and

9.15  $(T_{\mathbf{C}} - \bar{\lambda} I)(u - iv) = (Tu - au + bv) - i(Tv - av - bu).$

Our induction hypothesis, 9.13, and 9.14 imply that

$$(T_{\mathbf{C}} - \bar{\lambda} I)^{j-1}\bigl((Tu - au + bv) - i(Tv - av - bu)\bigr) = 0.$$

Now the equation above and 9.15 imply that $(T_{\mathbf{C}} - \bar{\lambda} I)^j(u - iv) = 0$, completing the proof in one direction.

The other direction is proved by replacing $\lambda$ with $\bar{\lambda}$, replacing v with −v, and then using the first direction.  ■

An important consequence of the result above is the next result, which states that if a number is an eigenvalue of $T_{\mathbf{C}}$, then its complex conjugate is also an eigenvalue of $T_{\mathbf{C}}$.

9.16  Nonreal eigenvalues of $T_{\mathbf{C}}$ come in pairs

Suppose V is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{C}$. Then $\lambda$ is an eigenvalue of $T_{\mathbf{C}}$ if and only if $\bar{\lambda}$ is an eigenvalue of $T_{\mathbf{C}}$.

Proof  Take j = 1 in 9.12.  ■

By  definition,  the  eigenvalues  of  an  operator  on  a  real  vector  space  are 
real  numbers.  Thus  when  mathematicians  sometimes  informally  mention  the 
complex  eigenvalues  of  an  operator  on  a  real  vector  space,  what  they  have  in 
mind  is  the  eigenvalues  of  the  complexification  of  the  operator. 

Recall  that  the  multiplicity  of  an  eigenvalue  is  defined  to  be  the  dimension 
of  the  generalized  eigenspace  corresponding  to  that  eigenvalue  (see  8.24).  The 
next  result  states  that  the  multiplicity  of  an  eigenvalue  of  a  complexification 
equals  the  multiplicity  of  its  complex  conjugate. 
9.17  Multiplicity of $\lambda$ equals multiplicity of $\bar{\lambda}$

Suppose V is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{C}$ is an eigenvalue of $T_{\mathbf{C}}$. Then the multiplicity of $\lambda$ as an eigenvalue of $T_{\mathbf{C}}$ equals the multiplicity of $\bar{\lambda}$ as an eigenvalue of $T_{\mathbf{C}}$.

Proof  Suppose $u_1 + iv_1, \dots, u_m + iv_m$ is a basis of the generalized eigenspace $G(\lambda, T_{\mathbf{C}})$, where $u_1, \dots, u_m, v_1, \dots, v_m \in V$. Then using 9.12, routine arguments show that $u_1 - iv_1, \dots, u_m - iv_m$ is a basis of the generalized eigenspace $G(\bar{\lambda}, T_{\mathbf{C}})$. Thus both $\lambda$ and $\bar{\lambda}$ have multiplicity m as eigenvalues of $T_{\mathbf{C}}$.  ■


9.18  Example  Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

$$T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3).$$

The matrix of T with respect to the standard basis of $\mathbf{R}^3$ is

$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 1 & 1 \end{pmatrix}.$$

As you can verify, 2 is an eigenvalue of T with multiplicity 1 and T has no other eigenvalues.

If we identify the complexification of $\mathbf{R}^3$ with $\mathbf{C}^3$, then the matrix of $T_{\mathbf{C}}$ with respect to the standard basis of $\mathbf{C}^3$ is the matrix above. As you can verify, the eigenvalues of $T_{\mathbf{C}}$ are 2, 1 + i, and 1 − i, each with multiplicity 1. Thus the nonreal eigenvalues of $T_{\mathbf{C}}$ come as a pair, with each the complex conjugate of the other and with the same multiplicity, as expected by 9.17.
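The eigenvalues claimed in this example can be confirmed with a short Python sketch. The eigenvectors listed below were worked out by hand for this illustration and are not given in the text; the helper `apply` is likewise an assumed name.

```python
def apply(A, z):
    """Multiply the matrix A (list of rows) by the column vector z."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]

A = [[2, 0, 0],
     [0, 1, -1],
     [0, 1, 1]]              # matrix of T from the example

# Candidate (eigenvalue, eigenvector) pairs for the complexification.
# The last two vectors are complex conjugates of each other, as 9.12 suggests.
pairs = [
    (2, [1, 0, 0]),
    (1 + 1j, [0, 1, -1j]),
    (1 - 1j, [0, 1, 1j]),
]

# Each entry is True when A z equals lam * z for that pair.
checks = [apply(A, z) == [lam * w for w in z] for lam, z in pairs]
```

All three checks succeed, and only the first eigenvector has real entries, consistent with 2 being the only eigenvalue of T itself.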

We have seen an example [5.8(a)] of an operator on $\mathbf{R}^2$ with no eigenvalues. The next result shows that no such example exists on $\mathbf{R}^3$.

9.19  Operator on odd-dimensional vector space has eigenvalue

Every operator on an odd-dimensional real vector space has an eigenvalue.

Proof  Suppose V is a real vector space with odd dimension and $T \in \mathcal{L}(V)$. Because the nonreal eigenvalues of $T_{\mathbf{C}}$ come in pairs with equal multiplicity (by 9.17), the sum of the multiplicities of all the nonreal eigenvalues of $T_{\mathbf{C}}$ is an even number.

Because the sum of the multiplicities of all the eigenvalues of $T_{\mathbf{C}}$ equals the (complex) dimension of $V_{\mathbf{C}}$ (by Theorem 8.26), the conclusion of the paragraph above implies that $T_{\mathbf{C}}$ has a real eigenvalue. Every real eigenvalue of $T_{\mathbf{C}}$ is also an eigenvalue of T (by 9.11), giving the desired result.  ■
Characteristic Polynomial of the Complexification

In  the  previous  chapter  we  defined  the  characteristic  polynomial  of  an  operator 
on  a  finite-dimensional  complex  vector  space  (see  8.34).  The  next  result  is 
a  key  step  toward  defining  the  characteristic  polynomial  for  operators  on 
finite-dimensional  real  vector  spaces. 

9.20  Characteristic polynomial of $T_{\mathbf{C}}$

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Then the coefficients of the characteristic polynomial of $T_{\mathbf{C}}$ are all real.

Proof  Suppose $\lambda$ is a nonreal eigenvalue of $T_{\mathbf{C}}$ with multiplicity m. Then $\bar{\lambda}$ is also an eigenvalue of $T_{\mathbf{C}}$ with multiplicity m (by 9.17). Thus the characteristic polynomial of $T_{\mathbf{C}}$ includes factors of $(z - \lambda)^m$ and $(z - \bar{\lambda})^m$. Multiplying together these two factors, we have

$$(z - \lambda)^m (z - \bar{\lambda})^m = \bigl(z^2 - 2(\operatorname{Re}\lambda)z + |\lambda|^2\bigr)^m.$$

The polynomial above on the right has real coefficients.

The characteristic polynomial of $T_{\mathbf{C}}$ is the product of terms of the form above and terms of the form $(z - t)^d$, where t is a real eigenvalue of $T_{\mathbf{C}}$ with multiplicity d. Thus the coefficients of the characteristic polynomial of $T_{\mathbf{C}}$ are all real.  ■


Now  we  can  define  the  characteristic  polynomial  of  an  operator  on  a 
finite-dimensional  real  vector  space  to  be  the  characteristic  polynomial  of  its 
complexification. 


9.21  Definition  Characteristic polynomial

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Then the characteristic polynomial of T is defined to be the characteristic polynomial of $T_{\mathbf{C}}$.


9.22  Example  Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

$$T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3).$$

As we noted in 9.18, the eigenvalues of $T_{\mathbf{C}}$ are 2, 1 + i, and 1 − i, each with multiplicity 1. Thus the characteristic polynomial of the complexification $T_{\mathbf{C}}$ is $(z - 2)(z - (1 + i))(z - (1 - i))$, which equals $z^3 - 4z^2 + 6z - 4$. Hence the characteristic polynomial of T is also $z^3 - 4z^2 + 6z - 4$.
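The expansion of this product can be reproduced mechanically. The following Python sketch multiplies the three linear factors as coefficient lists (highest degree first); `poly_mul` is a hypothetical helper written for this illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# The factors z - 2, z - (1 + i), and z - (1 - i).
factors = [[1, -2], [1, -(1 + 1j)], [1, -(1 - 1j)]]

char_poly = [1]
for f in factors:
    char_poly = poly_mul(char_poly, f)

# The imaginary parts cancel, as 9.20 predicts; keep only the real parts.
real_coefficients = [complex(c).real for c in char_poly]
```

The intermediate product of the first two factors has nonreal coefficients; they become real only once the conjugate factor is included.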
In the next result, the eigenvalues of T are all real (because T is an operator on a real vector space).

9.23  Degree and zeros of characteristic polynomial

Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Then

(a)  the coefficients of the characteristic polynomial of T are all real;

(b)  the characteristic polynomial of T has degree dim V;

(c)  the eigenvalues of T are precisely the real zeros of the characteristic polynomial of T.

Proof  Part (a) holds because of 9.20.

Part (b) follows from 8.36(a).

Part (c) holds because the real zeros of the characteristic polynomial of T are the real eigenvalues of $T_{\mathbf{C}}$ [by 8.36(a)], which are the eigenvalues of T (by 9.11).  ■

In  the  previous  chapter,  we  proved  the  Cayley-Hamilton  Theorem  (8.37) 
for  complex  vector  spaces.  Now  we  can  also  prove  it  for  real  vector  spaces. 

9.24  Cayley-Hamilton  Theorem 

Suppose $T \in \mathcal{L}(V)$. Let q denote the characteristic polynomial of T. Then $q(T) = 0$.

Proof  We have already proved this result when V is a complex vector space. Thus assume that V is a real vector space.

The complex case of the Cayley-Hamilton Theorem (8.37) implies that $q(T_{\mathbf{C}}) = 0$. Because q has real coefficients (by 9.20), 9.9 shows that $q(T_{\mathbf{C}})u = q(T)u$ for every $u \in V$. Thus we also have $q(T) = 0$, as desired.  ■


9.25  Example  Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

$$T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3).$$

As we saw in 9.22, the characteristic polynomial of T is $z^3 - 4z^2 + 6z - 4$. Thus the Cayley-Hamilton Theorem implies that $T^3 - 4T^2 + 6T - 4I = 0$, which can also be verified by direct calculation.
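The direct calculation mentioned here can be carried out with exact integer arithmetic. This Python sketch uses the matrix of T with respect to the standard basis (from 9.18); `mat_mul` and the variable names are illustrative assumptions.

```python
def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

A = [[2, 0, 0],
     [0, 1, -1],
     [0, 1, 1]]              # matrix of T with respect to the standard basis
I = [[1 if r == c else 0 for c in range(3)] for r in range(3)]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# Evaluate q(A) = A^3 - 4A^2 + 6A - 4I entry by entry.
q_of_A = [[A3[r][c] - 4 * A2[r][c] + 6 * A[r][c] - 4 * I[r][c]
           for c in range(3)] for r in range(3)]
```

Every entry of `q_of_A` is 0, which is the matrix form of the identity $T^3 - 4T^2 + 6T - 4I = 0$.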

We  can  now  prove  another  result  that  we  previously  knew  only  in  the 
complex  case. 
9.26  Characteristic polynomial is a multiple of minimal polynomial

Suppose $T \in \mathcal{L}(V)$. Then

(a)  the  degree  of  the  minimal  polynomial  of  T  is  at  most  dim  V ; 

(b)  the  characteristic  polynomial  of  T  is  a  polynomial  multiple  of  the 
minimal  polynomial  of  T. 

Proof  Part  (a)  follows  immediately  from  the  Cayley-Hamilton  Theorem. 
Part  (b)  follows  from  the  Cayley-Hamilton  Theorem  and  8.46.  ■ 


EXERCISES 9.A


1  Prove 9.3.

2  Verify that if V is a real vector space and $T \in \mathcal{L}(V)$, then $T_{\mathbf{C}} \in \mathcal{L}(V_{\mathbf{C}})$.

3  Suppose V is a real vector space and $v_1, \dots, v_m \in V$. Prove that $v_1, \dots, v_m$ is linearly independent in $V_{\mathbf{C}}$ if and only if $v_1, \dots, v_m$ is linearly independent in V.

4  Suppose V is a real vector space and $v_1, \dots, v_m \in V$. Prove that $v_1, \dots, v_m$ spans $V_{\mathbf{C}}$ if and only if $v_1, \dots, v_m$ spans V.

5  Suppose that V is a real vector space and $S, T \in \mathcal{L}(V)$. Show that $(S + T)_{\mathbf{C}} = S_{\mathbf{C}} + T_{\mathbf{C}}$ and that $(\lambda T)_{\mathbf{C}} = \lambda T_{\mathbf{C}}$ for every $\lambda \in \mathbf{R}$.

6  Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Prove that $T_{\mathbf{C}}$ is invertible if and only if T is invertible.

7  Suppose V is a real vector space and $N \in \mathcal{L}(V)$. Prove that $N_{\mathbf{C}}$ is nilpotent if and only if N is nilpotent.

8  Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ and 5, 7 are eigenvalues of T. Prove that $T_{\mathbf{C}}$ has no nonreal eigenvalues.

9  Prove there does not exist an operator $T \in \mathcal{L}(\mathbf{R}^7)$ such that $T^2 + T + I$ is nilpotent.

10  Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^7)$ such that $T^2 + T + I$ is nilpotent.
11  Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Suppose there exist $b, c \in \mathbf{R}$ such that $T^2 + bT + cI = 0$. Prove that T has an eigenvalue if and only if $b^2 \geq 4c$.

12  Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Suppose there exist $b, c \in \mathbf{R}$ such that $b^2 < 4c$ and $T^2 + bT + cI$ is nilpotent. Prove that T has no eigenvalues.

13  Suppose V is a real vector space, $T \in \mathcal{L}(V)$, and $b, c \in \mathbf{R}$ are such that $b^2 < 4c$. Prove that $\operatorname{null}(T^2 + bT + cI)^j$ has even dimension for every positive integer j.

14  Suppose V is a real vector space with dim V = 8. Suppose $T \in \mathcal{L}(V)$ is such that $T^2 + T + I$ is nilpotent. Prove that $(T^2 + T + I)^4 = 0$.

15  Suppose V is a real vector space and $T \in \mathcal{L}(V)$ has no eigenvalues. Prove that every subspace of V invariant under T has even dimension.

16  Suppose V is a real vector space. Prove that there exists $T \in \mathcal{L}(V)$ such that $T^2 = -I$ if and only if V has even dimension.

17  Suppose V is a real vector space and $T \in \mathcal{L}(V)$ satisfies $T^2 = -I$. Define complex scalar multiplication on V as follows: if $a, b \in \mathbf{R}$, then

$$(a + bi)v = av + bTv.$$

(a)  Show that the complex scalar multiplication on V defined above and the addition on V makes V into a complex vector space.

(b)  Show that the dimension of V as a complex vector space is half the dimension of V as a real vector space.

18  Suppose V is a real vector space and $T \in \mathcal{L}(V)$. Prove that the following are equivalent:

(a)  All the eigenvalues of $T_{\mathbf{C}}$ are real.

(b)  There exists a basis of V with respect to which T has an upper-triangular matrix.

(c)  There exists a basis of V consisting of generalized eigenvectors of T.

19  Suppose V is a real vector space with dim V = n and $T \in \mathcal{L}(V)$ is such that $\operatorname{null} T^{n-2} \neq \operatorname{null} T^{n-1}$. Prove that T has at most two distinct eigenvalues and that $T_{\mathbf{C}}$ has no nonreal eigenvalues.
Operators on Real Inner Product Spaces

We  now  switch  our  focus  to  the  context  of  inner  product  spaces.  We  will  give 
a  complete  description  of  normal  operators  on  real  inner  product  spaces;  a 
key  step  in  the  proof  of  this  result  (9.34)  requires  the  result  from  the  previous 
section  that  an  operator  on  a  finite-dimensional  real  vector  space  has  an 
invariant  subspace  of  dimension  1  or  2  (9.8). 

After  describing  the  normal  operators  on  real  inner  product  spaces,  we  will 
use  that  result  to  give  a  complete  description  of  isometries  on  such  spaces. 


9.B 


Normal  Operators  on  Real  Inner  Product  Spaces 

The  Complex  Spectral  Theorem  (7.24)  gives  a  complete  description  of  normal 
operators  on  complex  inner  product  spaces.  In  this  subsection  we  will  give  a 
complete  description  of  normal  operators  on  real  inner  product  spaces. 

We  begin  with  a  description  of  the  operators  on  2-dimensional  real  inner 
product  spaces  that  are  normal  but  not  self-adjoint. 


9.27  Normal  but  not  self-adjoint  operators 

Suppose V is a 2-dimensional real inner product space and $T \in \mathcal{L}(V)$. Then the following are equivalent:

(a)  T is normal but not self-adjoint.

(b)  The matrix of T with respect to every orthonormal basis of V has the form

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$

with $b \neq 0$.

(c)  The matrix of T with respect to some orthonormal basis of V has the form

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$

with b > 0.

Proof  First suppose (a) holds, so that T is normal but not self-adjoint. Let $e_1, e_2$ be an orthonormal basis of V. Suppose

9.28  $\mathcal{M}(T, (e_1, e_2)) = \begin{pmatrix} a & c \\ b & d \end{pmatrix}.$

Then $\|Te_1\|^2 = a^2 + b^2$ and $\|T^*e_1\|^2 = a^2 + c^2$. Because T is normal, $\|Te_1\| = \|T^*e_1\|$ (see 7.20); thus these equations imply that $b^2 = c^2$. Thus c = b or c = −b. But $c \neq b$, because otherwise T would be self-adjoint, as can be seen from the matrix in 9.28. Hence c = −b, so

9.29  $\mathcal{M}(T, (e_1, e_2)) = \begin{pmatrix} a & -b \\ b & d \end{pmatrix}.$

The matrix of T* is the transpose of the matrix above. Use matrix multiplication to compute the matrices of TT* and T*T (do it now). Because T is normal, these two matrices are equal. Equating the entries in the upper-right corner of the two matrices you computed, you will discover that bd = ab. Now $b \neq 0$, because otherwise T would be self-adjoint, as can be seen from the matrix in 9.29. Thus d = a, completing the proof that (a) implies (b).
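For readers who want to compare their answer to the computation requested above, here is one way to write it out. It assumes the matrix 9.29 for T and uses the stated fact that the matrix of T* is its transpose.

```latex
\[
\mathcal{M}(T) = \begin{pmatrix} a & -b \\ b & d \end{pmatrix}, \qquad
\mathcal{M}(T^*) = \begin{pmatrix} a & b \\ -b & d \end{pmatrix},
\]
\[
\mathcal{M}(TT^*) = \begin{pmatrix} a^2 + b^2 & ab - bd \\ ab - bd & b^2 + d^2 \end{pmatrix}, \qquad
\mathcal{M}(T^*T) = \begin{pmatrix} a^2 + b^2 & bd - ab \\ bd - ab & b^2 + d^2 \end{pmatrix}.
\]
```

Equating the upper-right entries gives ab − bd = bd − ab, which simplifies to bd = ab.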

Now  suppose  (b)  holds.  We  want  to  prove  that  (c)  holds.  Choose  an 
orthonormal  basis  e\ ,  of  V.  We  know  that  the  matrix  of  T  with  respect  to 
this  basis  has  the  form  given  by  (b),  with  b  ^  0.  If  b  >  0,  then  (c)  holds 
and  we  have  proved  that  (b)  implies  (c).  If  b  <  0,  then,  as  you  should  verify, 
the  matrix  of  T  with  respect  to  the  orthonormal  basis  e\ ,  —  equals  ( ba ), 
where  —b  >  0;  thus  in  this  case  we  also  see  that  (b)  implies  (c). 

Now  suppose  (c)  holds,  so  that  the  matrix  of  T  with  respect  to  some 
orthonormal  basis  has  the  form  given  in  (c)  with  b  >  0.  Clearly  the  matrix 
of  T  is  not  equal  to  its  transpose  (because  b  ^  0).  Hence  T  is  not  self-adjoint. 
Now  use  matrix  multiplication  to  verify  that  the  matrices  0f  TT*  and  T*T 
are  equal.  We  conclude  that  TT*  =  T*T.  Hence  T  is  normal.  Thus  (c) 
implies  (a),  completing  the  proof.  ■ 
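
The matrix multiplications that this proof asks the reader to carry out can also be checked mechanically. The following Python sketch is only an illustration and is not part of the argument: the helper names `matmul`, `transpose`, and `is_normal`, and the sample values a = 3, b = 2, d = 5, are assumptions chosen for the example. Because the entries are real, the matrix of $T^*$ is taken to be the transpose.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    """Transpose; for real entries this is the matrix of the adjoint."""
    return [list(col) for col in zip(*X)]

def is_normal(X):
    """True when X commutes with its transpose."""
    Xt = transpose(X)
    return matmul(X, Xt) == matmul(Xt, X)

# Illustrative values only (hypothetical, not from the text).
a, b, d = Fraction(3), Fraction(2), Fraction(5)

form_c = [[a, -b], [b, a]]        # the form in (c), with b > 0
unequal_diag = [[a, -b], [b, d]]  # the form 9.29 with d different from a

print(is_normal(form_c), form_c != transpose(form_c))
print(is_normal(unequal_diag))
```

With these sample values the first matrix is normal and differs from its transpose, while the second fails to be normal. This agrees with the step in the proof where $bd = ab$ and $b \ne 0$ force $d = a$.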


The  next  result  tells  us  that  a  normal  operator  restricted  to  an  invariant 
subspace  is  normal.  This  will  allow  us  to  use  induction  on  dim  V  when  we 
prove  our  description  of  normal  operators  (9.34). 


9.30  Normal operators and invariant subspaces

Suppose V is an inner product space, $T \in \mathcal{L}(V)$ is normal, and U is a
subspace of V that is invariant under T. Then

(a)  $U^\perp$ is invariant under T;

(b)  U is invariant under $T^*$;

(c)  $(T|_U)^* = (T^*)|_U$;

(d)  $T|_U \in \mathcal{L}(U)$ and $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ are normal operators.

Proof  First we will prove (a). Let $e_1, \dots, e_m$ be an orthonormal basis
of U. Extend to an orthonormal basis $e_1, \dots, e_m, f_1, \dots, f_n$ of V (this is
possible by 6.35). Because U is invariant under T, each $Te_j$ is a linear
combination of $e_1, \dots, e_m$. Thus the matrix of T with respect to the basis
$e_1, \dots, e_m, f_1, \dots, f_n$ is of the form

\[
\mathcal{M}(T) \;=\;
\begin{array}{c|cc}
 & e_1 \,\cdots\, e_m & f_1 \,\cdots\, f_n \\ \hline
e_1, \dots, e_m & A & B \\
f_1, \dots, f_n & 0 & C
\end{array}\;;
\]

here A denotes an m-by-m matrix, 0 denotes the n-by-m matrix of all 0’s, B
denotes an m-by-n matrix, C denotes an n-by-n matrix, and for convenience
the basis has been listed along the top and left sides of the matrix.

For each $j \in \{1, \dots, m\}$, $\|Te_j\|^2$ equals the sum of the squares of the
absolute values of the entries in the $j$th column of A (see 6.25). Hence

9.31
\[
\sum_{j=1}^{m} \|Te_j\|^2 = \text{the sum of the squares of the absolute values of the entries of } A.
\]

For each $j \in \{1, \dots, m\}$, $\|T^*e_j\|^2$ equals the sum of the squares of the
absolute values of the entries in the $j$th rows of A and B. Hence

9.32
\[
\sum_{j=1}^{m} \|T^*e_j\|^2 = \text{the sum of the squares of the absolute values of the entries of } A \text{ and } B.
\]

Because T is normal, $\|Te_j\| = \|T^*e_j\|$ for each j (see 7.20); thus

\[
\sum_{j=1}^{m} \|Te_j\|^2 = \sum_{j=1}^{m} \|T^*e_j\|^2.
\]

This equation, along with 9.31 and 9.32, implies that the sum of the squares
of the absolute values of the entries of B equals 0. In other words, B is the
matrix of all 0’s. Thus

9.33
\[
\mathcal{M}(T) \;=\;
\begin{array}{c|cc}
 & e_1 \,\cdots\, e_m & f_1 \,\cdots\, f_n \\ \hline
e_1, \dots, e_m & A & 0 \\
f_1, \dots, f_n & 0 & C
\end{array}\;.
\]

This representation shows that $Tf_k$ is in the span of $f_1, \dots, f_n$ for each k.
Because $f_1, \dots, f_n$ is a basis of $U^\perp$, this implies that $Tv \in U^\perp$ whenever
$v \in U^\perp$. In other words, $U^\perp$ is invariant under T, completing the proof of (a).

To prove (b), note that $\mathcal{M}(T^*)$, which is the conjugate transpose of $\mathcal{M}(T)$,
has a block of 0’s in the lower left corner (because $\mathcal{M}(T)$, as given above, has
a block of 0’s in the upper right corner). In other words, each $T^*e_j$ can be
written as a linear combination of $e_1, \dots, e_m$. Thus U is invariant under $T^*$,
completing the proof of (b).

To prove (c), let $S = T|_U \in \mathcal{L}(U)$. Fix $v \in U$. Then

\[
\langle Su, v \rangle = \langle Tu, v \rangle = \langle u, T^*v \rangle
\]

for all $u \in U$. Because $T^*v \in U$ [by (b)], the equation above shows that
$S^*v = T^*v$. In other words, $(T|_U)^* = (T^*)|_U$, completing the proof of (c).

To prove (d), note that T commutes with $T^*$ (because T is normal) and
that $(T|_U)^* = (T^*)|_U$ [by (c)]. Thus $T|_U$ commutes with its adjoint and
hence is normal. Interchanging the roles of U and $U^\perp$, which is justified by
(a), shows that $T|_{U^\perp}$ is also normal, completing the proof of (d).  ■


Our next result shows that normal operators on real inner product spaces
come close to having diagonal matrices. Specifically, we get block diagonal
matrices, with each block having size at most 2-by-2.

We cannot expect to do better than the next result, because on a real inner
product space there exist normal operators that do not have a diagonal matrix
with respect to any basis. For example, the operator $T \in \mathcal{L}(\mathbf{R}^2)$ defined by
$T(x, y) = (-y, x)$ is normal (as you should verify) but has no eigenvalues;
thus this particular T does not have even an upper-triangular matrix with
respect to any basis of $\mathbf{R}^2$.


Note that if an operator T has a block diagonal matrix with respect
to some basis, then the entry in each 1-by-1 block on the diagonal
of this matrix is an eigenvalue of T.

9.34  Characterization of normal operators when F = R

Suppose V is a real inner product space and $T \in \mathcal{L}(V)$. Then the following
are equivalent:

(a)  T is normal.

(b)  There is an orthonormal basis of V with respect to which T has a
block diagonal matrix such that each block is a 1-by-1 matrix or a
2-by-2 matrix of the form

\[
\begin{pmatrix} a & -b \\ b & a \end{pmatrix}
\]

with $b > 0$.

Proof  First  suppose  (b)  holds.  With  respect  to  the  basis  given  by  (b),  the 
matrix  of  T  commutes  with  the  matrix  of  T*  (which  is  the  transpose  of  the 
matrix  of  T),  as  you  should  verify  (use  Exercise  9  in  Section  8.B  for  the 
product  of  two  block  diagonal  matrices).  Thus  T  commutes  with  T*  which 
means  that  T  is  normal,  completing  the  proof  that  (b)  implies  (a). 

Now  suppose  (a)  holds,  so  T  is  normal.  We  will  prove  that  (b)  holds 
by  induction  on  dim  V.  To  get  started,  note  that  our  desired  result  holds  if 
dim  V  =  1  (trivially)  or  if  dim  V  =  2  [if  T  is  self-adjoint,  use  the  Real 
Spectral  Theorem  (7.29);  if  T  is  not  self-adjoint,  use  9.27]. 

Now  assume  that  dim  V  >  2  and  that  the  desired  result  holds  on  vector 
spaces  of  smaller  dimension.  Let  U  be  a  subspace  of  V  of  dimension  1  that 
is  invariant  under  T  if  such  a  subspace  exists  (in  other  words,  if  T  has  an 
eigenvector,  let  U  be  the  span  of  this  eigenvector).  If  no  such  subspace  exists, 
let  U  be  a  subspace  of  V  of  dimension  2  that  is  invariant  under  T  (an  invariant 
subspace  of  dimension  1  or  2  always  exists  by  9.8). 

If dim U = 1, choose a vector in U with norm 1; this vector will be
an orthonormal basis of U, and of course the matrix of $T|_U \in \mathcal{L}(U)$ is a
1-by-1 matrix. If dim U = 2, then $T|_U \in \mathcal{L}(U)$ is normal (by 9.30) but not
self-adjoint (otherwise $T|_U$, and hence T, would have an eigenvector by 7.27).
Thus we can choose an orthonormal basis of U with respect to which the
matrix of $T|_U \in \mathcal{L}(U)$ has the required form (see 9.27).

Now $U^\perp$ is invariant under T and $T|_{U^\perp}$ is a normal operator on $U^\perp$
(by 9.30). Thus by our induction hypothesis, there is an orthonormal basis
of $U^\perp$ with respect to which the matrix of $T|_{U^\perp}$ has the desired form.
Adjoining this basis to the basis of U gives an orthonormal basis of V with respect
to which the matrix of T has the desired form. Thus (b) holds.  ■


Isometries on Real Inner Product Spaces

As  we  will  see,  the  next  example  is  a  key  building  block  for  isometries  on  real 
inner  product  spaces.  Also,  note  that  the  next  example  shows  that  an  isometry 
on  R2  may  have  no  eigenvalues. 

9.35  Example  Let $\theta \in \mathbf{R}$. Then the operator on $\mathbf{R}^2$ of counterclockwise
rotation (centered at the origin) by an angle of $\theta$ is an isometry, as is
geometrically obvious. The matrix of this operator with respect to the standard basis
is

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]

If $\theta$ is not an integer multiple of $\pi$, then no nonzero vector of $\mathbf{R}^2$ gets mapped
to a scalar multiple of itself, and hence the operator has no eigenvalues.
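
The two claims in this example can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions: the function names `rotate` and `discriminant`, the angle 0.7, and the vector (3, -4) are all choices made for the example. The eigenvalue check relies on a standard fact not developed at this point in the text, namely that a real eigenvalue of the rotation matrix would have to be a real root of $z^2 - 2\cos\theta\, z + 1$.

```python
import math

def rotate(theta, v):
    """Apply counterclockwise rotation by theta to the vector v = (x, y)."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def discriminant(theta):
    """Discriminant of z^2 - 2 cos(theta) z + 1; a negative value means
    the polynomial has no real root."""
    return 4 * math.cos(theta) ** 2 - 4

theta = 0.7          # illustrative angle, not an integer multiple of pi
v = (3.0, -4.0)      # illustrative vector
w = rotate(theta, v)

print(math.hypot(*v), math.hypot(*w))  # the two norms agree up to rounding
print(discriminant(theta) < 0)         # whether a real eigenvalue is ruled out
```

A floating-point check of this kind only supports the statement for the sampled angle and vector; the geometric argument in the example is what covers every case.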

The next result shows that every isometry on a real inner product space is
composed of pieces that are rotations on 2-dimensional subspaces, pieces that
equal the identity operator, and pieces that equal multiplication by $-1$.

9.36  Description of isometries when F = R

Suppose V is a real inner product space and $S \in \mathcal{L}(V)$. Then the following
are equivalent:

(a)  S is an isometry.

(b)  There is an orthonormal basis of V with respect to which S has
a block diagonal matrix such that each block on the diagonal is a
1-by-1 matrix containing 1 or $-1$ or is a 2-by-2 matrix of the form

\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\]

with $\theta \in (0, \pi)$.

Proof  First suppose (a) holds, so S is an isometry. Because S is normal, there
is an orthonormal basis of V with respect to which S has a block diagonal
matrix such that each block is a 1-by-1 matrix or a 2-by-2 matrix of the form

9.37
\[
\begin{pmatrix} a & -b \\ b & a \end{pmatrix}
\]

with $b > 0$ (by 9.34).

If $\lambda$ is an entry in a 1-by-1 matrix along the diagonal of the matrix of S
(with respect to the basis mentioned above), then there is a basis vector $e_j$
such that $Se_j = \lambda e_j$. Because S is an isometry, this implies that $|\lambda| = 1$.
Thus $\lambda = 1$ or $\lambda = -1$, because these are the only real numbers with absolute
value 1.

Now consider a 2-by-2 matrix of the form 9.37 along the diagonal of the
matrix of S. There are basis vectors $e_j, e_{j+1}$ such that

\[
Se_j = a e_j + b e_{j+1}.
\]

Thus

\[
1 = \|e_j\|^2 = \|Se_j\|^2 = a^2 + b^2.
\]

The equation above, along with the condition $b > 0$, implies that there exists
a number $\theta \in (0, \pi)$ such that $a = \cos\theta$ and $b = \sin\theta$. Thus the matrix 9.37
has the required form, completing the proof in this direction.

Conversely, now suppose (b) holds, so there is an orthonormal basis of V
with respect to which the matrix of S has the form required by the theorem.
Thus there is a direct sum decomposition

\[
V = U_1 \oplus \cdots \oplus U_m,
\]

where each $U_j$ is a subspace of V of dimension 1 or 2. Furthermore, any two
vectors belonging to distinct U’s are orthogonal, and each $S|_{U_j}$ is an isometry
mapping $U_j$ into $U_j$. If $v \in V$, we can write

\[
v = u_1 + \cdots + u_m,
\]

where each $u_j$ is in $U_j$. Applying S to the equation above and then taking
norms gives

\[
\|Sv\|^2 = \|Su_1 + \cdots + Su_m\|^2
         = \|Su_1\|^2 + \cdots + \|Su_m\|^2
         = \|u_1\|^2 + \cdots + \|u_m\|^2
         = \|v\|^2.
\]

Thus S is an isometry, and hence (a) holds.  ■


EXERCISES 9.B


1  Suppose $S \in \mathcal{L}(\mathbf{R}^3)$ is an isometry. Prove that there exists a nonzero
vector $x \in \mathbf{R}^3$ such that $S^2x = x$.

2  Prove that every isometry on an odd-dimensional real inner product space
has 1 or $-1$ as an eigenvalue.

3  Suppose V is a real inner product space. Show that

\[
\langle u + iv, x + iy \rangle = \langle u, x \rangle + \langle v, y \rangle + \bigl(\langle v, x \rangle - \langle u, y \rangle\bigr) i
\]

for $u, v, x, y \in V$ defines a complex inner product on $V_{\mathbf{C}}$.

4  Suppose V is a real inner product space and $T \in \mathcal{L}(V)$ is self-adjoint.
Show that $T_{\mathbf{C}}$ is a self-adjoint operator on the inner product space $V_{\mathbf{C}}$
defined by the previous exercise.

5  Use the previous exercise to give a proof of the Real Spectral Theorem
(7.29) via complexification and the Complex Spectral Theorem (7.24).

6  Give an example of an operator T on an inner product space such that T
has an invariant subspace whose orthogonal complement is not invariant
under T.

[The exercise above shows that 9.30 can fail without the hypothesis that
T is normal.]

7  Suppose $T \in \mathcal{L}(V)$ and T has a block diagonal matrix

\[
\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix}
\]

with respect to some basis of V. For $j = 1, \dots, m$, let $T_j$ be the operator
on V whose matrix with respect to the same basis is a block diagonal
matrix with blocks the same size as in the matrix above, with $A_j$ in the
$j$th block, and with all the other blocks on the diagonal equal to identity
matrices (of the appropriate size). Prove that $T = T_1 \cdots T_m$.

8  Suppose D is the differentiation operator on the vector space V in
Exercise 21 in Section 7.A. Find an orthonormal basis of V such that
the matrix of the normal operator D has the form promised by 9.34.


Trace and Determinant


Throughout  this  book  our  emphasis  has  been  on  linear  maps  and  operators 
rather  than  on  matrices.  In  this  chapter  we  pay  more  attention  to  matrices  as 
we  define  the  trace  and  determinant  of  an  operator  and  then  connect  these 
notions  to  the  corresponding  notions  for  matrices.  The  book  concludes  with 
an  explanation  of  the  important  role  played  by  determinants  in  the  theory  of 
volume  and  integration. 

Our  assumptions  for  this  chapter  are  as  follows: 

10.1  Notation  F,  V 

•  F  denotes  R  or  C. 

•  V denotes a finite-dimensional nonzero vector space over F.


10.A  Trace


For  our  study  of  the  trace  and  determinant,  we  will  need  to  know  how  the 
matrix  of  an  operator  changes  with  a  change  of  basis.  Thus  we  begin  this 
chapter  by  developing  the  necessary  material  about  change  of  basis. 

Change  of  Basis 

With respect to every basis of V, the matrix of the identity operator $I \in \mathcal{L}(V)$
is the diagonal matrix with 1’s on the diagonal and 0’s elsewhere. We also use
the symbol I for the name of this matrix, as shown in the next definition.

10.2  Definition  identity matrix, I

Suppose n is a positive integer. The n-by-n diagonal matrix

\[
\begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix}
\]

is called the identity matrix and is denoted I.

Note that we use the symbol I to denote the identity operator (on all vector
spaces) and the identity matrix (of all possible sizes). You should always be
able to tell from the context which particular meaning of I is intended. For
example, consider the equation $\mathcal{M}(I) = I$; on the left side I denotes the
identity operator, and on the right side I denotes the identity matrix.

If A is a square matrix (with entries in F, as usual) with the same size as I,
then $AI = IA = A$, as you should verify.

10.3  Definition  invertible, inverse, $A^{-1}$

A square matrix A is called invertible if there is a square matrix B of
the same size such that $AB = BA = I$; we call B the inverse of A and
denote it by $A^{-1}$.

Some mathematicians use the terms nonsingular, which means
the same as invertible, and singular, which means the same
as noninvertible.

The same proof as used in 3.54 shows that if A is an invertible square
matrix, then there is a unique matrix B such that $AB = BA = I$ (and thus the
notation $B = A^{-1}$ is justified).

In Section 3.C we defined the matrix of a linear map from one vector space
to  another  with  respect  to  two  bases — one  basis  of  the  first  vector  space  and 
another  basis  of  the  second  vector  space.  When  we  study  operators,  which  are 
linear  maps  from  a  vector  space  to  itself,  we  almost  always  use  the  same  basis 
for  both  vector  spaces  (after  all,  the  two  vector  spaces  in  question  are  equal). 
Thus  we  usually  refer  to  the  matrix  of  an  operator  with  respect  to  a  basis  and 
display  at  most  one  basis  because  we  are  using  one  basis  in  two  capacities. 

The  next  result  is  one  of  the  unusual  cases  in  which  we  use  two  different 
bases  even  though  we  have  operators  from  a  vector  space  to  itself.  It  is  just  a 
convenient  restatement  of  3.43  (with  U  and  W  both  equal  to  V),  but  now  we 
are  being  more  careful  to  include  the  various  bases  explicitly  in  the  notation. 
The  result  below  holds  because  we  defined  matrix  multiplication  to  make  it 
true — see  3.43  and  the  material  preceding  it. 

10.4  The matrix of the product of linear maps

Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ and $w_1, \dots, w_n$ are all bases of V.
Suppose $S, T \in \mathcal{L}(V)$. Then

\[
\mathcal{M}\bigl(ST, (u_1, \dots, u_n), (w_1, \dots, w_n)\bigr) =
\mathcal{M}\bigl(S, (v_1, \dots, v_n), (w_1, \dots, w_n)\bigr)\,
\mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr).
\]

The next result deals with the matrix of the identity operator I with
respect to two different bases. Note that the $k$th column of the matrix
$\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ consists of the scalars needed to write $u_k$
as a linear combination of $v_1, \dots, v_n$.

10.5  Matrix of the identity with respect to two bases

Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of V. Then the matrices
$\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ and $\mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr)$
are invertible, and each is the inverse of the other.

Proof  In 10.4, replace $w_j$ with $u_j$, and replace S and T with I, getting

\[
I = \mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr)\,
    \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr).
\]

Now interchange the roles of the u’s and v’s, getting

\[
I = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)\,
    \mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr).
\]

These two equations give the desired result.  ■

10.6  Example  Consider the bases (4, 2), (5, 3) and (1, 0), (0, 1) of $\mathbf{F}^2$.
Obviously

\[
\mathcal{M}\bigl(I, ((4, 2), (5, 3)), ((1, 0), (0, 1))\bigr) = \begin{pmatrix} 4 & 5 \\ 2 & 3 \end{pmatrix},
\]

because $I(4, 2) = 4(1, 0) + 2(0, 1)$ and $I(5, 3) = 5(1, 0) + 3(0, 1)$.

The inverse of the matrix above is

\[
\begin{pmatrix} \tfrac{3}{2} & -\tfrac{5}{2} \\ -1 & 2 \end{pmatrix},
\]

as you should verify. Thus 10.5 implies that

\[
\mathcal{M}\bigl(I, ((1, 0), (0, 1)), ((4, 2), (5, 3))\bigr) = \begin{pmatrix} \tfrac{3}{2} & -\tfrac{5}{2} \\ -1 & 2 \end{pmatrix}.
\]


Now we can see how the matrix of T changes when we change bases. In
the result below, we have two different bases of V. Recall that the notation
$\mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr)$ is shorthand for $\mathcal{M}\bigl(T, (u_1, \dots, u_n), (u_1, \dots, u_n)\bigr)$.

10.7  Change of basis formula

Suppose $T \in \mathcal{L}(V)$. Let $u_1, \dots, u_n$ and $v_1, \dots, v_n$ be bases of V. Let
$A = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$. Then

\[
\mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = A^{-1}\,\mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)\,A.
\]

Proof  In 10.4, replace $w_j$ with $u_j$ and replace S with I, getting

10.8
\[
\mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = A^{-1}\,\mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr),
\]

where we have used 10.5.

Again use 10.4, this time replacing $w_j$ with $v_j$. Also replace T with I and
replace S with T, getting

\[
\mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr) = \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)\,A.
\]

Substituting the equation above into 10.8 gives the desired result.  ■


Trace: A Connection Between Operators and Matrices

Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of T. Let n = dim V. Recall
that we defined the multiplicity of $\lambda$ to be the dimension of the generalized
eigenspace $G(\lambda, T)$ (see 8.24) and that this multiplicity equals
$\dim \operatorname{null}(T - \lambda I)^n$ (see 8.11). Recall also that if V is a complex vector
space, then the sum of the multiplicities of all the eigenvalues of T equals n
(see 8.26).

In the definition below, the sum of the eigenvalues “with each eigenvalue
repeated according to its multiplicity” means that if $\lambda_1, \dots, \lambda_m$ are the distinct
eigenvalues of T (or of $T_{\mathbf{C}}$ if V is a real vector space) with multiplicities
$d_1, \dots, d_m$, then the sum is

\[
d_1\lambda_1 + \cdots + d_m\lambda_m.
\]

Or if you prefer to list the eigenvalues with each repeated according to its
multiplicity, then the eigenvalues could be denoted $\lambda_1, \dots, \lambda_n$ (where the
index n equals dim V) and the sum is

\[
\lambda_1 + \cdots + \lambda_n.
\]

10.9  Definition  trace of an operator

Suppose $T \in \mathcal{L}(V)$.

•  If F = C, then the trace of T is the sum of the eigenvalues of T,
with each eigenvalue repeated according to its multiplicity.

•  If F = R, then the trace of T is the sum of the eigenvalues of $T_{\mathbf{C}}$,
with each eigenvalue repeated according to its multiplicity.

The trace of T is denoted by trace T.


10.10  Example  Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is the operator whose matrix is


Then the eigenvalues of T are 1, $2 + 3i$, and $2 - 3i$, each with multiplicity 1,
as you can verify. Computing the sum of the eigenvalues, we find that
trace T = $1 + (2 + 3i) + (2 - 3i)$; in other words, trace T = 5.

The trace has a close connection with the characteristic polynomial.
Suppose $\lambda_1, \dots, \lambda_n$ are the eigenvalues of T (or of $T_{\mathbf{C}}$ if V is a real vector space)
with each eigenvalue repeated according to its multiplicity. Then by definition
(see 8.34 and 9.21), the characteristic polynomial of T equals

\[
(z - \lambda_1) \cdots (z - \lambda_n).
\]

Expanding the polynomial above, we can write the characteristic polynomial
of T in the form

10.11
\[
z^n - (\lambda_1 + \cdots + \lambda_n) z^{n-1} + \cdots + (-1)^n (\lambda_1 \cdots \lambda_n).
\]

The  expression  above  immediately  leads  to  the  following  result. 

10.12  Trace  and  characteristic  polynomial 

Suppose $T \in \mathcal{L}(V)$. Let n = dim V. Then trace T equals the negative of
the coefficient of $z^{n-1}$ in the characteristic polynomial of T.

Most  of  the  rest  of  this  section  is  devoted  to  discovering  how  to  compute 
trace  T  from  the  matrix  of  T  (with  respect  to  an  arbitrary  basis). 

Let’s start with the easiest situation. Suppose V is a complex vector space,
$T \in \mathcal{L}(V)$, and we choose a basis of V as in 8.29. With respect to that basis,
T has an upper-triangular matrix with the diagonal of the matrix containing
precisely the eigenvalues of T, each repeated according to its multiplicity.
Thus trace T equals the sum of the diagonal entries of $\mathcal{M}(T)$ with respect to
that basis.

The same formula works for the operator $T \in \mathcal{L}(\mathbf{C}^3)$ in Example 10.10
whose  trace  equals  5.  In  that  example,  the  matrix  is  not  in  upper-triangular 
form.  However,  the  sum  of  the  diagonal  entries  of  the  matrix  in  that  example 
equals  5,  which  is  the  trace  of  the  operator  T. 

At  this  point  you  should  suspect  that  trace  T  equals  the  sum  of  the  diagonal 
entries  of  the  matrix  of  T  with  respect  to  an  arbitrary  basis.  Remarkably,  this 
suspicion  turns  out  to  be  true.  To  prove  it,  we  start  by  making  the  following 
definition. 

10.13  Definition  trace  of  a  matrix 

The trace of a square matrix A, denoted trace A, is defined to be the sum
of the diagonal entries of A.

Now we have defined the trace of an operator and the trace of a square
matrix, using the same word “trace” in two different contexts. This would be
bad terminology unless the two concepts turn out to be essentially the same.
As we will see, it is indeed true that trace T = trace $\mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)$,
where $v_1, \dots, v_n$ is an arbitrary basis of V. We will need the following result
for the proof.


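10.14  Trace of AB equals trace of BA

If A and B are square matrices of the same size, then

\[
\operatorname{trace}(AB) = \operatorname{trace}(BA).
\]

Proof  Suppose

\[
A = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \cdots & A_{n,n} \end{pmatrix},
\qquad
B = \begin{pmatrix} B_{1,1} & \cdots & B_{1,n} \\ \vdots & & \vdots \\ B_{n,1} & \cdots & B_{n,n} \end{pmatrix}.
\]

The $j$th term on the diagonal of AB equals

\[
\sum_{k=1}^{n} A_{j,k} B_{k,j}.
\]

Thus

\begin{align*}
\operatorname{trace}(AB) &= \sum_{j=1}^{n} \sum_{k=1}^{n} A_{j,k} B_{k,j} \\
&= \sum_{k=1}^{n} \sum_{j=1}^{n} B_{k,j} A_{j,k} \\
&= \sum_{k=1}^{n} \bigl(k\text{th term on the diagonal of } BA\bigr) \\
&= \operatorname{trace}(BA),
\end{align*}

as desired.  ■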
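
A small numerical instance may help fix the indices in the double sum. The Python sketch below uses two sample 2-by-2 matrices chosen for illustration (they are not from the text), and the helper names `matmul` and `trace` are likewise assumptions. The point of the example is that AB and BA can differ as matrices while their traces agree.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Sample matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB, BA)                # the two products themselves
print(trace(AB), trace(BA))  # their traces
```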

Now we can prove that the sum of the diagonal entries of the matrix of
an operator is independent of the basis with respect to which the matrix is
computed.

10.15  Trace of matrix of operator does not depend on basis

Let $T \in \mathcal{L}(V)$. Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of V. Then

\[
\operatorname{trace} \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = \operatorname{trace} \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr).
\]

Proof  Let $A = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$. Then

\begin{align*}
\operatorname{trace} \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr)
&= \operatorname{trace}\Bigl(A^{-1}\bigl(\mathcal{M}(T, (v_1, \dots, v_n))\,A\bigr)\Bigr) \\
&= \operatorname{trace}\Bigl(\bigl(\mathcal{M}(T, (v_1, \dots, v_n))\,A\bigr)A^{-1}\Bigr) \\
&= \operatorname{trace} \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr),
\end{align*}

where the first equality comes from 10.7 and the second equality follows
from 10.14. The third equality completes the proof.  ■
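
The change of basis formula 10.7 and the basis independence just proved can be seen together on the bases of Example 10.6. In the Python sketch below, `A` and `A_inv` are the two matrices from that example, while `M_v` is a hypothetical matrix of some operator T with respect to the standard basis, made up for this illustration. The helper names are also assumptions. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

# Matrices from Example 10.6: u-basis (4,2),(5,3); v-basis (1,0),(0,1).
A = [[F(4), F(5)], [F(2), F(3)]]
A_inv = [[F(3, 2), F(-5, 2)], [F(-1), F(2)]]
identity = [[1, 0], [0, 1]]

# Hypothetical matrix of an operator T with respect to the v-basis.
M_v = [[F(1), F(2)], [F(3), F(4)]]

# Change of basis formula 10.7: matrix of T with respect to the u-basis.
M_u = matmul(A_inv, matmul(M_v, A))

print(matmul(A, A_inv) == identity, matmul(A_inv, A) == identity)
print(M_u)
print(trace(M_u), trace(M_v))
```

The two matrices of T look quite different, yet their diagonal sums coincide, which is what 10.15 asserts.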


The  result  below,  which  is  the  most  important  result  in  this  section,  states 
that  the  trace  of  an  operator  equals  the  sum  of  the  diagonal  entries  of  the 
matrix  of  the  operator.  This  theorem  does  not  specify  a  basis  because,  by  the 
result  above,  the  sum  of  the  diagonal  entries  of  the  matrix  of  an  operator  is 
the  same  for  every  choice  of  basis. 


10.16  Trace of an operator equals trace of its matrix

Suppose $T \in \mathcal{L}(V)$. Then trace T = trace $\mathcal{M}(T)$.


Proof  As noted above, trace $\mathcal{M}(T)$ is independent of which basis of V we
choose (by 10.15). Thus to show that

\[
\operatorname{trace} T = \operatorname{trace} \mathcal{M}(T)
\]

for every basis of V, we need only show that the equation above holds for
some basis of V.

As we have already discussed, if V is a complex vector space, then
choosing the basis as in 8.29 gives the desired result. If V is a real vector space,
then applying the complex case to the complexification $T_{\mathbf{C}}$ (which is used to
define trace T) gives the desired result.  ■

If we know the matrix of an operator on a complex vector space, the result
above allows us to find the sum of all the eigenvalues without finding any of
the eigenvalues, as shown by the next example.

10.17  Example  Consider the operator on $\mathbf{C}^5$ whose matrix is

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & -3 \\
1 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

No one can find an exact formula for any of the eigenvalues of this operator.
However, we do know that the sum of the eigenvalues equals 0, because the
sum of the diagonal entries of the matrix above equals 0.
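
The link between the trace and the coefficient of $z^{n-1}$ (see 10.12) can be checked on this matrix by machine. The sketch below computes the coefficients of $\det(zI - A)$ with the Faddeev-LeVerrier recurrence. That technique is not used anywhere in this text and is swapped in here only for illustration. The sketch also assumes that $\det(zI - A)$ agrees with the characteristic polynomial as defined in this book, a fact that needs determinants to justify. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

def char_poly(A):
    """Coefficients of det(zI - A), highest degree first, computed with
    the Faddeev-LeVerrier recurrence in exact rational arithmetic."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = Fraction(-trace(AM), k)   # next coefficient
        coeffs.append(c)
        # Next auxiliary matrix: A*M + c*I.
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

# The matrix of Example 10.17.
C = [
    [0, 0, 0, 0, -3],
    [1, 0, 0, 0, 6],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

p = char_poly(C)
print(p)               # coefficients of z^5, z^4, ..., z^0
print(trace(C), -p[1]) # trace versus negated z^4 coefficient
```

For this matrix the recurrence returns the coefficients of $z^5 - 6z + 3$, so the coefficient of $z^4$ is 0, matching the diagonal sum.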

We  can  use  10.16  to  give  easy  proofs  of  some  useful  properties  about 
traces  of  operators  by  shifting  to  the  language  of  traces  of  matrices,  where 
certain  properties  have  already  been  proved  or  are  obvious.  The  proof  of  the 
next  result  is  an  example  of  this  technique.  The  eigenvalues  of  S  +  T  are  not, 
in  general,  formed  from  adding  together  eigenvalues  of  S  and  eigenvalues  of 
T.  Thus  the  next  result  would  be  difficult  to  prove  without  using  10.16. 

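10.18  Trace is additive

Suppose $S, T \in \mathcal{L}(V)$. Then trace(S + T) = trace S + trace T.

Proof  Choose a basis of V. Then

\begin{align*}
\operatorname{trace}(S + T) &= \operatorname{trace} \mathcal{M}(S + T) \\
&= \operatorname{trace}\bigl(\mathcal{M}(S) + \mathcal{M}(T)\bigr) \\
&= \operatorname{trace} \mathcal{M}(S) + \operatorname{trace} \mathcal{M}(T) \\
&= \operatorname{trace} S + \operatorname{trace} T,
\end{align*}

where again the first and last equalities come from 10.16; the third equality is
obvious from the definition of the trace of a matrix.  ■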

The techniques we have developed have the following curious consequence.
A generalization of this result to infinite-dimensional vector spaces has
important consequences in modern physics, particularly in quantum theory.

The statement of the next result does not involve traces, although
the short proof uses traces. Whenever something like this happens in
mathematics, we can be sure that a good definition lurks in the
background.

10.19  The identity is not the difference of ST and TS

There do not exist operators $S, T \in \mathcal{L}(V)$ such that $ST - TS = I$.

Proof  Suppose $S, T \in \mathcal{L}(V)$. Choose a basis of V. Then

\begin{align*}
\operatorname{trace}(ST - TS) &= \operatorname{trace}(ST) - \operatorname{trace}(TS) \\
&= \operatorname{trace} \mathcal{M}(ST) - \operatorname{trace} \mathcal{M}(TS) \\
&= \operatorname{trace}\bigl(\mathcal{M}(S)\mathcal{M}(T)\bigr) - \operatorname{trace}\bigl(\mathcal{M}(T)\mathcal{M}(S)\bigr) \\
&= 0,
\end{align*}

where the first equality comes from 10.18, the second equality comes from
10.16, the third equality comes from 3.43, and the fourth equality comes from
10.14. Clearly the trace of I equals dim V, which is not 0. Because $ST - TS$
and I have different traces, they cannot be equal.  ■
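
As a concrete instance of this argument, the Python sketch below forms $ST - TS$ for two sample 2-by-2 matrices and compares its trace with that of the identity matrix. The matrices and the helper names `matmul`, `trace`, and `commutator` are assumptions made for the illustration. A single example proves nothing by itself, but it shows the obstruction that the proof exploits.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))

def commutator(X, Y):
    """The matrix XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

# Sample matrices, chosen only for illustration.
S = [[1, 2], [3, 4]]
T = [[0, 1], [5, -2]]
identity = [[1, 0], [0, 1]]

K = commutator(S, T)
print(K)
print(trace(K), trace(identity))
```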


EXERCISES 10.A


1  Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of V. Prove that the matrix
$\mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)$ is invertible if and only if T is invertible.

2  Suppose A and B are square matrices of the same size and $AB = I$.
Prove that $BA = I$.

3  Suppose $T \in \mathcal{L}(V)$ has the same matrix with respect to every basis of V.
Prove that T is a scalar multiple of the identity operator.

4  Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of V. Let $T \in \mathcal{L}(V)$ be the
operator such that $Tv_k = u_k$ for $k = 1, \dots, n$. Prove that

\[
\mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr) = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr).
\]

5  Suppose B is a square matrix with complex entries. Prove that there
exists an invertible square matrix A with complex entries such that
$A^{-1}BA$ is an upper-triangular matrix.

6  Give  an  example  of  a  real  vector  space  V  and  T  e  C(V)  such  that 
trace(T2 3 4 5 6 7)  <  0. 

7  Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, and $V$ has a basis consisting
of eigenvectors of $T$. Prove that $\operatorname{trace}(T^2) \ge 0$.

8  Suppose $V$ is an inner product space and $v, w \in V$. Define $T \in \mathcal{L}(V)$ by
$Tu = \langle u, v \rangle w$. Find a formula for $\operatorname{trace} T$.

9  Suppose $P \in \mathcal{L}(V)$ satisfies $P^2 = P$. Prove that

\[ \operatorname{trace} P = \dim \operatorname{range} P. \]

10  Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$. Prove that

\[ \operatorname{trace} T^* = \overline{\operatorname{trace} T}. \]

11  Suppose $V$ is an inner product space. Suppose $T \in \mathcal{L}(V)$ is a positive
operator and $\operatorname{trace} T = 0$. Prove that $T = 0$.

12  Suppose $V$ is an inner product space and $P, Q \in \mathcal{L}(V)$ are orthogonal
projections. Prove that $\operatorname{trace}(PQ) \ge 0$.

13  Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is the operator whose matrix is

\[ \begin{pmatrix} 51 & -12 & -21 \\ 60 & -40 & -28 \\ 57 & -68 & 1 \end{pmatrix}. \]

Someone tells you (accurately) that $-48$ and $24$ are eigenvalues of $T$.
Without using a computer or writing anything down, find the third eigenvalue
of $T$.

14  Suppose $T \in \mathcal{L}(V)$ and $c \in \mathbf{F}$. Prove that $\operatorname{trace}(cT) = c \operatorname{trace} T$.

15  Suppose $S, T \in \mathcal{L}(V)$. Prove that $\operatorname{trace}(ST) = \operatorname{trace}(TS)$.

16  Prove or give a counterexample: if $S, T \in \mathcal{L}(V)$, then $\operatorname{trace}(ST) =
(\operatorname{trace} S)(\operatorname{trace} T)$.

17  Suppose $T \in \mathcal{L}(V)$ is such that $\operatorname{trace}(ST) = 0$ for all $S \in \mathcal{L}(V)$. Prove
that $T = 0$.

18  Suppose $V$ is an inner product space with orthonormal basis $e_1, \dots, e_n$
and $T \in \mathcal{L}(V)$. Prove that

\[ \operatorname{trace}(T^*T) = \|Te_1\|^2 + \dots + \|Te_n\|^2. \]

Conclude that the right side of the equation above is independent of
which orthonormal basis $e_1, \dots, e_n$ is chosen for $V$.

19  Suppose $V$ is an inner product space. Prove that

\[ \langle S, T \rangle = \operatorname{trace}(ST^*) \]

defines an inner product on $\mathcal{L}(V)$.

20  Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$. Let
$\lambda_1, \dots, \lambda_n$ be the eigenvalues of $T$, repeated according to multiplicity.
Suppose

\[ \begin{pmatrix} A_{1,1} & \dots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \dots & A_{n,n} \end{pmatrix} \]

is the matrix of $T$ with respect to some orthonormal basis of $V$. Prove
that

\[ |\lambda_1|^2 + \dots + |\lambda_n|^2 \le \sum_{k=1}^{n} \sum_{j=1}^{n} |A_{j,k}|^2. \]

21  Suppose $V$ is an inner product space. Suppose $T \in \mathcal{L}(V)$ and

\[ \|T^*v\| \le \|Tv\| \]

for every $v \in V$. Prove that $T$ is normal.

[The exercise above fails on infinite-dimensional inner product spaces,
leading to what are called hyponormal operators, which have a
well-developed theory.]


10.B Determinant


Determinant  of  an  Operator 

Now  we  are  ready  to  define  the  determinant  of  an  operator.  Notice  that  the 
definition  below  mimics  the  approach  we  took  when  defining  the  trace,  with 
the  product  of  the  eigenvalues  replacing  the  sum  of  the  eigenvalues. 

10.20 Definition determinant of an operator; $\det T$
Suppose $T \in \mathcal{L}(V)$.

• If $\mathbf{F} = \mathbf{C}$, then the determinant of $T$ is the product of the eigenvalues
of $T$, with each eigenvalue repeated according to its multiplicity.

• If $\mathbf{F} = \mathbf{R}$, then the determinant of $T$ is the product of the eigenvalues
of $T_{\mathbf{C}}$, with each eigenvalue repeated according to its multiplicity.

The determinant of $T$ is denoted by $\det T$.

If $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$ (or of $T_{\mathbf{C}}$ if $V$ is a real
vector space) with multiplicities $d_1, \dots, d_m$, then the definition above implies

\[ \det T = \lambda_1^{d_1} \cdots \lambda_m^{d_m}. \]

Or if you prefer to list the eigenvalues with each repeated according to its
multiplicity, then the eigenvalues could be denoted $\lambda_1, \dots, \lambda_n$ (where the
index $n$ equals $\dim V$) and the definition above implies

\[ \det T = \lambda_1 \cdots \lambda_n. \]




10.21 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is the operator whose matrix is


Then the eigenvalues of $T$ are $1$, $2 + 3i$, and $2 - 3i$, each with multiplicity 1,
as you can verify. Computing the product of the eigenvalues, we find that
$\det T = 1 \cdot (2 + 3i) \cdot (2 - 3i)$; in other words, $\det T = 13$.

The determinant has a close connection with the characteristic polynomial.
Suppose $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $T$ (or of $T_{\mathbf{C}}$ if $V$ is a real vector
space) with each eigenvalue repeated according to its multiplicity. Then the
expression for the characteristic polynomial of $T$ given by 10.11 gives the
following result.

10.22 Determinant and characteristic polynomial

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then $\det T$ equals $(-1)^n$ times the
constant term of the characteristic polynomial of $T$.

Combining the result above and 10.12, we have the following result.

10.23 Characteristic polynomial, trace, and determinant

Suppose $T \in \mathcal{L}(V)$. Then the characteristic polynomial of $T$ can be
written as

\[ z^n - (\operatorname{trace} T) z^{n-1} + \dots + (-1)^n (\det T). \]
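
As a quick numerical illustration of 10.22 and 10.23, the sketch below expands $(z - \lambda_1) \cdots (z - \lambda_n)$ for the eigenvalues $1$, $2 + 3i$, $2 - 3i$ listed in Example 10.21 and reads the trace and the determinant off the coefficients. The helper name `poly_from_roots` is an assumption made for this illustration; it is not notation from the text.

```python
def poly_from_roots(roots):
    """Coefficients of (z - r_1)...(z - r_n), highest power of z first."""
    coeffs = [1]
    for r in roots:
        shifted = coeffs + [0]                   # current polynomial times z
        scaled = [0] + [r * c for c in coeffs]   # current polynomial times r
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

# Eigenvalues from Example 10.21, each with multiplicity 1.
eigenvalues = [1, 2 + 3j, 2 - 3j]
n = len(eigenvalues)
coeffs = poly_from_roots(eigenvalues)

# 10.23: the coefficient of z^(n-1) is -(trace T);
# 10.22: the constant term is (-1)^n (det T).
trace_T = -coeffs[1]
det_T = (-1) ** n * coeffs[-1]
print(coeffs, trace_T, det_T)
```

The constant term recovers $\det T = 13$, in agreement with Example 10.21.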

We  turn  now  to  some  simple  but  important  properties  of  determinants. 
Later  we  will  discover  how  to  calculate  det  T  from  the  matrix  of  T  (with 
respect  to  an  arbitrary  basis). 

The  crucial  result  below  has  an  easy  proof  due  to  our  definition. 

10.24  Invertible  is  equivalent  to  nonzero  determinant 

An  operator  on  V  is  invertible  if  and  only  if  its  determinant  is  nonzero. 

Proof First suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. The
operator $T$ is invertible if and only if 0 is not an eigenvalue of $T$. Clearly this
happens if and only if the product of the eigenvalues of $T$ is not 0. Thus $T$ is
invertible if and only if $\det T \ne 0$, as desired.

Now consider the case where $V$ is a real vector space and $T \in \mathcal{L}(V)$.
Again, $T$ is invertible if and only if 0 is not an eigenvalue of $T$, which happens
if and only if 0 is not an eigenvalue of $T_{\mathbf{C}}$ (because $T_{\mathbf{C}}$ and $T$ have the same
real eigenvalues by 9.11). Thus again we see that $T$ is invertible if and only if
$\det T \ne 0$. ■

Some textbooks take the result below as the definition of the characteristic
polynomial and then have our definition of the characteristic polynomial as a
consequence.

10.25 Characteristic polynomial of $T$ equals $\det(zI - T)$

Suppose $T \in \mathcal{L}(V)$. Then the characteristic polynomial of $T$ equals
$\det(zI - T)$.

Proof First suppose $V$ is a complex vector space. If $\lambda, z \in \mathbf{C}$, then $\lambda$ is an
eigenvalue of $T$ if and only if $z - \lambda$ is an eigenvalue of $zI - T$, as can be seen
from the equation

\[ -(T - \lambda I) = (zI - T) - (z - \lambda)I. \]

Raising both sides of this equation to the $\dim V$ power and then taking null
spaces of both sides shows that the multiplicity of $\lambda$ as an eigenvalue of $T$
equals the multiplicity of $z - \lambda$ as an eigenvalue of $zI - T$.

Let $\lambda_1, \dots, \lambda_n$ denote the eigenvalues of $T$, repeated according to
multiplicity. Thus for $z \in \mathbf{C}$, the paragraph above shows that the eigenvalues
of $zI - T$ are $z - \lambda_1, \dots, z - \lambda_n$, repeated according to multiplicity. The
determinant of $zI - T$ is the product of these eigenvalues. In other words,

\[ \det(zI - T) = (z - \lambda_1) \cdots (z - \lambda_n). \]

The right side of the equation above is, by definition, the characteristic
polynomial of $T$, completing the proof when $V$ is a complex vector space.

Now suppose $V$ is a real vector space. Applying the complex case to $T_{\mathbf{C}}$
gives the desired result. ■

Determinant  of  a  Matrix 

Our next task is to discover how to compute $\det T$ from the matrix of $T$ (with
respect to an arbitrary basis). Let's start with the easiest situation. Suppose
$V$ is a complex vector space, $T \in \mathcal{L}(V)$, and we choose a basis of $V$ as in
8.29. With respect to that basis, $T$ has an upper-triangular matrix with the
diagonal of the matrix containing precisely the eigenvalues of $T$, each repeated
according to its multiplicity. Thus $\det T$ equals the product of the diagonal
entries of $\mathcal{M}(T)$ with respect to that basis.

When dealing with the trace in the previous section, we discovered that the
formula (trace = sum of diagonal entries) that worked for the upper-triangular
matrix given by 8.29 also worked with respect to an arbitrary basis. Could that
also work for determinants? In other words, is the determinant of an operator
equal to the product of the diagonal entries of the matrix of the operator with
respect to an arbitrary basis?

Unfortunately, the determinant is more complicated than the trace. In
particular, $\det T$ need not equal the product of the diagonal entries of $\mathcal{M}(T)$ with
respect to an arbitrary basis. For example, the operator in Example 10.21 has
determinant 13 but the product of the diagonal entries of its matrix equals 0.

For each square matrix $A$, we want to define the determinant of $A$, denoted
$\det A$, so that $\det T = \det \mathcal{M}(T)$ regardless of which basis is used to
compute $\mathcal{M}(T)$. We begin our search for the correct definition of the determinant
of a matrix by calculating the determinants of some special operators.

10.26 Example Suppose $a_1, \dots, a_n \in \mathbf{F}$. Let

\[ A = \begin{pmatrix} 0 & & & & a_n \\ a_1 & 0 & & & \\ & a_2 & 0 & & \\ & & \ddots & \ddots & \\ & & & a_{n-1} & 0 \end{pmatrix}; \]

here all entries of the matrix are 0 except for the upper-right corner and
along the line just below the diagonal. Suppose $v_1, \dots, v_n$ is a basis of $V$ and
$T \in \mathcal{L}(V)$ is such that $\mathcal{M}(T, (v_1, \dots, v_n)) = A$. Find the determinant of $T$.

Solution First assume $a_j \ne 0$ for each $j = 1, \dots, n - 1$. Note that the list
$v_1, Tv_1, T^2v_1, \dots, T^{n-1}v_1$ equals $v_1, a_1v_2, \dots, a_1 \cdots a_{n-1}v_n$.

Thus $v_1, Tv_1, \dots, T^{n-1}v_1$ is linearly independent (because the $a$'s are
all nonzero). Hence if $p$ is a monic polynomial with degree at most $n - 1$, then
$p(T)v_1 \ne 0$. Thus the minimal polynomial of $T$ cannot have degree less
than $n$.

As you should verify, $T^nv_j = a_1 \cdots a_nv_j$ for each $j$. Thus we have
$T^n = a_1 \cdots a_nI$. Hence $z^n - a_1 \cdots a_n$ is the minimal polynomial of $T$.
Because $n = \dim V$ and the characteristic polynomial is a polynomial multiple
of the minimal polynomial (9.26), this implies that $z^n - a_1 \cdots a_n$ is also the
characteristic polynomial of $T$.

Thus 10.22 implies that

\[ \det T = (-1)^{n-1} a_1 \cdots a_n. \]

If some $a_j$ equals 0, then $Tv_j = 0$ for some $j$, which implies that 0 is an
eigenvalue of $T$ and hence $\det T = 0$. In other words, the formula above also
holds if some $a_j$ equals 0.
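
The claim $T^n = a_1 \cdots a_n I$ in the solution above can be checked numerically. The following is a minimal sketch for $n = 4$ with arbitrarily chosen integer values of the $a_j$; the function names `cyclic_matrix`, `matmul`, and `matpow` are assumptions made for this illustration.

```python
def cyclic_matrix(a):
    """Matrix of Example 10.26: a_1, ..., a_{n-1} just below the diagonal, a_n top right."""
    n = len(a)
    A = [[0] * n for _ in range(n)]
    for k in range(n - 1):
        A[k + 1][k] = a[k]
    A[0][n - 1] = a[n - 1]
    return A

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, p):
    """X raised to the nonnegative integer power p."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, X)
    return result

a = [2, 3, 5, 7]            # illustrative values only
A = cyclic_matrix(a)
power = matpow(A, len(a))   # expected to equal (2 * 3 * 5 * 7) times the identity
print(power)
```

With these values the formula from the example gives $\det T = (-1)^{3} \cdot 210 = -210$.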


Computing the minimal polynomial is often an efficient method of finding
the characteristic polynomial, as is done in this example.

Thus in order to have $\det T = \det \mathcal{M}(T)$, we will have to make the
determinant of the matrix in Example 10.26 equal to $(-1)^{n-1} a_1 \cdots a_n$. However,
we do not yet have enough evidence to make a reasonable guess about the
proper definition of the determinant of an arbitrary square matrix.

To  compute  the  determinants  of  a  more  complicated  class  of  operators,  we 
introduce  the  notion  of  permutation. 

10.27 Definition permutation, $\operatorname{perm} n$

• A permutation of $(1, \dots, n)$ is a list $(m_1, \dots, m_n)$ that contains
each of the numbers $1, \dots, n$ exactly once.

• The set of all permutations of $(1, \dots, n)$ is denoted $\operatorname{perm} n$.

For example, $(2, 3, 4, 5, 1) \in \operatorname{perm} 5$. You should think of an element of
$\operatorname{perm} n$ as a rearrangement of the first $n$ integers.


10.28 Example Suppose $a_1, \dots, a_n \in \mathbf{F}$ and $v_1, \dots, v_n$ is a basis of $V$.
Consider a permutation $(p_1, \dots, p_n) \in \operatorname{perm} n$ that can be obtained as
follows: break $(1, \dots, n)$ into lists of consecutive integers and in each list move
the first term to the end of that list. For example, taking $n = 9$, the permutation

\[ (2, 3, 1, 5, 6, 7, 4, 9, 8) \]

is obtained from $(1, 2, 3)$, $(4, 5, 6, 7)$, $(8, 9)$ by moving the first term of each of
these lists to the end, producing $(2, 3, 1)$, $(5, 6, 7, 4)$, $(9, 8)$, and then putting
these together to form the permutation displayed above.

Let $T \in \mathcal{L}(V)$ be the operator such that

\[ Tv_k = a_kv_{p_k} \]

for $k = 1, \dots, n$. Find $\det T$.

Solution This generalizes Example 10.26, because if $(p_1, \dots, p_n)$ is the
permutation $(2, 3, \dots, n, 1)$, then our operator $T$ is the same as the operator
$T$ in Example 10.26.

With respect to the basis $v_1, \dots, v_n$, the matrix of the operator $T$ is a block
diagonal matrix

\[ A = \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_M \end{pmatrix}, \]

where each block is a square matrix of the form of the matrix in 10.26.

Correspondingly, we can write $V = V_1 \oplus \dots \oplus V_M$, where each $V_j$ is
invariant under $T$ and each $T|_{V_j}$ is of the form of the operator in 10.26.
Because $\det T = (\det T|_{V_1}) \cdots (\det T|_{V_M})$ (because the dimensions of the
generalized eigenspaces in the $V_j$ add up to $\dim V$), we have

\[ \det T = (-1)^{n_1 - 1} \cdots (-1)^{n_M - 1} a_1 \cdots a_n, \]

where $V_j$ has dimension $n_j$ (and correspondingly each $A_j$ has size $n_j$-by-$n_j$)
and we have used the result from 10.26.


The number $(-1)^{n_1 - 1} \cdots (-1)^{n_M - 1}$ that appears above is called the sign
of the corresponding permutation $(p_1, \dots, p_n)$, denoted $\operatorname{sign}(p_1, \dots, p_n)$
[this is a temporary definition that we will change to an equivalent definition
later, when we define the sign of an arbitrary permutation].
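
The construction in Example 10.28 and the temporary sign just introduced can be sketched in a few lines. The function names and the encoding of a permutation by its list of block sizes $n_1, \dots, n_M$ are assumptions made for this illustration.

```python
def permutation_from_blocks(block_sizes):
    """Break (1, ..., n) into consecutive lists of the given sizes and
    move the first term of each list to the end of that list."""
    perm = []
    start = 1
    for size in block_sizes:
        block = list(range(start, start + size))
        perm.extend(block[1:] + block[:1])
        start += size
    return tuple(perm)

def temporary_sign(block_sizes):
    """The product (-1)^(n_1 - 1) ... (-1)^(n_M - 1) from Example 10.28."""
    s = 1
    for size in block_sizes:
        s *= (-1) ** (size - 1)
    return s

# The n = 9 permutation from the text comes from blocks of sizes 3, 4, 2.
print(permutation_from_blocks([3, 4, 2]))
print(temporary_sign([3, 4, 2]))
```

A single block of size $n$ reproduces the permutation $(2, 3, \dots, n, 1)$ of Example 10.26, with temporary sign $(-1)^{n-1}$.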

To put this into a form that does not depend on the particular permutation
$(p_1, \dots, p_n)$, let $A_{j,k}$ denote the entry in row $j$, column $k$, of the matrix $A$
from Example 10.28. Thus

\[ A_{j,k} = \begin{cases} 0 & \text{if } j \ne p_k, \\ a_k & \text{if } j = p_k. \end{cases} \]

Example 10.28 shows that we want

10.29
\[ \det A = \sum_{(m_1, \dots, m_n) \in \operatorname{perm} n} \bigl(\operatorname{sign}(m_1, \dots, m_n)\bigr) A_{m_1,1} \cdots A_{m_n,n}; \]

note that each summand is 0 except the one corresponding to the permutation
$(p_1, \dots, p_n)$ [which is why it does not matter that the sign of the other
permutations is not yet defined].

We  can  now  guess  that  det  A  should  be  defined  by  10.29  for  an  arbitrary 
square  matrix  A.  This  will  turn  out  to  be  correct.  We  will  now  dispense  with 
the  motivation  and  begin  the  more  formal  approach.  First  we  will  need  to 
define  the  sign  of  an  arbitrary  permutation. 


10.30 Definition sign of a permutation

• The sign of a permutation $(m_1, \dots, m_n)$ is defined to be 1 if the
number of pairs of integers $(j, k)$ with $1 \le j < k \le n$ such that
$j$ appears after $k$ in the list $(m_1, \dots, m_n)$ is even and $-1$ if the
number of such pairs is odd.

• In other words, the sign of a permutation equals 1 if the natural
order has been changed an even number of times and equals $-1$ if
the natural order has been changed an odd number of times.

10.31 Example sign of permutation

• The only pair of integers $(j, k)$ with $j < k$ such that $j$ appears after
$k$ in the list $(2, 1, 3, 4)$ is $(1, 2)$. Thus the permutation $(2, 1, 3, 4)$ has
sign $-1$.

• In the permutation $(2, 3, \dots, n, 1)$, the only pairs $(j, k)$ with $j < k$
that appear with changed order are $(1, 2), (1, 3), \dots, (1, n)$; because we
have $n - 1$ such pairs, the sign of this permutation equals $(-1)^{n-1}$ (note
that the same quantity appeared in Example 10.26).
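
Definition 10.30 translates directly into a count of out-of-order pairs. The following is a minimal sketch; the function name `sign` and the representation of a permutation as a Python tuple are choices made for this illustration.

```python
def sign(perm):
    """Sign as in 10.30: count pairs (j, k) with j < k such that j appears
    after k in the list, and return 1 if the count is even, -1 if it is odd."""
    out_of_order = sum(
        1
        for first in range(len(perm))
        for second in range(first + 1, len(perm))
        if perm[first] > perm[second]   # the larger number appears first
    )
    return 1 if out_of_order % 2 == 0 else -1

print(sign((2, 1, 3, 4)))      # first bullet of Example 10.31
print(sign((2, 3, 4, 5, 1)))   # second bullet with n = 5
```

For the $n = 9$ permutation of Example 10.28 this count is 6, so the sign is 1, in agreement with the temporary definition $(-1)^{2}(-1)^{3}(-1)^{1} = 1$.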

The  next  result  shows  that  interchanging  two  entries  of  a  permutation 
changes  the  sign  of  the  permutation. 

10.32  Interchanging  two  entries  in  a  permutation 

Interchanging  two  entries  in  a  permutation  multiplies  the  sign  of  the 
permutation  by  —1. 

Proof  Suppose  we  have  two  permutations,  where  the  second  permutation  is 
obtained  from  the  first  by  interchanging  two  entries.  If  the  two  interchanged 
entries  were  in  their  natural  order  in  the  first  permutation,  then  they  no  longer 
are  in  the  second  permutation,  and  vice  versa,  for  a  net  change  (so  far)  of  1  or 
—  1  (both  odd  numbers)  in  the  number  of  pairs  not  in  their  natural  order. 

Consider  each  entry  between  the  two 
interchanged  entries.  If  an  intermediate 
entry  was  originally  in  the  natural  order 
with  respect  to  both  interchanged  entries,  then  it  is  now  in  the  natural  order 
with  respect  to  neither  interchanged  entry.  Similarly,  if  an  intermediate  entry 
was  originally  in  the  natural  order  with  respect  to  neither  of  the  interchanged 
entries,  then  it  is  now  in  the  natural  order  with  respect  to  both  interchanged 
entries.  If  an  intermediate  entry  was  originally  in  the  natural  order  with  respect 
to  exactly  one  of  the  interchanged  entries,  then  that  is  still  true.  Thus  the  net 
change  for  each  intermediate  entry  in  the  number  of  pairs  not  in  their  natural 
order  is  2,  —2,  or  0  (all  even  numbers). 

For  all  the  other  entries,  there  is  no  change  in  the  number  of  pairs  not  in 
their  natural  order.  Thus  the  total  net  change  in  the  number  of  pairs  not  in 
their  natural  order  is  an  odd  number.  Thus  the  sign  of  the  second  permutation 
equals  —1  times  the  sign  of  the  first  permutation.  ■ 
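
For small $n$, result 10.32 can also be confirmed by brute force: interchange every pair of entries in every permutation and compare signs. This is only a sanity check under the inversion-count reading of 10.30, not a substitute for the proof; the function names are assumptions made for this illustration.

```python
from itertools import combinations, permutations

def sign(perm):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(perm[i] > perm[j]
                       for i in range(len(perm))
                       for j in range(i + 1, len(perm)))
    return 1 if out_of_order % 2 == 0 else -1

def swap_flips_sign(n):
    """Check 10.32 for every permutation of (1, ..., n) and every pair of positions."""
    for perm in permutations(range(1, n + 1)):
        for i, j in combinations(range(n), 2):
            swapped = list(perm)
            swapped[i], swapped[j] = swapped[j], swapped[i]
            if sign(swapped) != -sign(perm):
                return False
    return True

print(swap_flips_sign(4))
```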


Some  texts  use  the  term  signum, 
which  means  the  same  as  sign. 


Our motivation for the next definition comes from 10.29.

10.33 Definition determinant of a matrix, $\det A$
Suppose $A$ is an $n$-by-$n$ matrix

\[ A = \begin{pmatrix} A_{1,1} & \dots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \dots & A_{n,n} \end{pmatrix}. \]

The determinant of $A$, denoted $\det A$, is defined by

\[ \det A = \sum_{(m_1, \dots, m_n) \in \operatorname{perm} n} \bigl(\operatorname{sign}(m_1, \dots, m_n)\bigr) A_{m_1,1} \cdots A_{m_n,n}. \]
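
Definition 10.33 can be transcribed almost literally into code. The sketch below is meant only to make the definition concrete: it sums over all $n!$ permutations, so it is practical only for small matrices. Rows and columns are indexed from 0 rather than 1, which does not affect the sign; the function names are assumptions made for this illustration.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33: the sum over permutations (m_1, ..., m_n) of
    sign(m_1, ..., m_n) times the product of the entries A[m_k][k]."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

print(det([[1, 2], [3, 4]]))   # A_{1,1} A_{2,2} - A_{2,1} A_{1,2}
```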


10.34 Example determinants

• If $A$ is the 1-by-1 matrix $[A_{1,1}]$, then $\det A = A_{1,1}$, because $\operatorname{perm} 1$
has only one element, namely $(1)$, which has sign 1.

• Clearly $\operatorname{perm} 2$ has only two elements, namely $(1, 2)$, which has sign 1,
and $(2, 1)$, which has sign $-1$. Thus

\[ \det \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix} = A_{1,1}A_{2,2} - A_{2,1}A_{1,2}. \]


The set $\operatorname{perm} 3$ contains six elements. In general, $\operatorname{perm} n$ contains
$n!$ elements. Note that $n!$ rapidly grows large as $n$ increases.

To make sure you understand this process, you should now find the
formula for the determinant of an arbitrary 3-by-3 matrix using just the
definition given above.


10.35 Example Compute the determinant of an upper-triangular matrix

\[ A = \begin{pmatrix} A_{1,1} & & * \\ & \ddots & \\ 0 & & A_{n,n} \end{pmatrix}. \]

Solution The permutation $(1, 2, \dots, n)$ has sign 1 and thus contributes a term
of $A_{1,1} \cdots A_{n,n}$ to the sum defining $\det A$ in 10.33. Any other permutation
$(m_1, \dots, m_n) \in \operatorname{perm} n$ contains at least one entry $m_j$ with $m_j > j$, which
means that $A_{m_j,j} = 0$ (because $A$ is upper triangular). Thus all the other
terms in the sum in 10.33 make no contribution.

Hence $\det A = A_{1,1} \cdots A_{n,n}$. In other words, the determinant of an
upper-triangular matrix equals the product of the diagonal entries.

Suppose $V$ is a complex vector space, $T \in \mathcal{L}(V)$, and we choose a basis
of $V$ as in 8.29. With respect to that basis, $T$ has an upper-triangular matrix
with the diagonal of the matrix containing precisely the eigenvalues of $T$,
each repeated according to its multiplicity. Thus Example 10.35 tells us that
$\det T = \det \mathcal{M}(T)$, where the matrix is with respect to that basis.

Our  goal  is  to  prove  that  det  T  =  det  M(T)  for  every  basis  of  V,  not  just 
the  basis  from  8.29.  To  do  this,  we  will  need  to  develop  some  properties  of 
determinants  of  matrices.  The  result  below  is  the  first  of  the  properties  we 
will  need. 

10.36  Interchanging  two  columns  in  a  matrix 

Suppose  A  is  a  square  matrix  and  B  is  the  matrix  obtained  from  A  by 

interchanging  two  columns.  Then 

det  A  =  —  det  B. 

Proof Think of the sum defining $\det A$ in 10.33 and the corresponding sum
defining $\det B$. The same products of $A_{j,k}$'s appear in both sums, although
they correspond to different permutations. The permutation corresponding to
a given product of $A_{j,k}$'s when computing $\det B$ is obtained by interchanging
two entries in the corresponding permutation when computing $\det A$, thus
multiplying the sign of the permutation by $-1$ (see 10.32). Hence we see that
$\det A = -\det B$. ■
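
A small numerical check of 10.36, using the permutation-sum definition 10.33 and an arbitrarily chosen integer matrix. The matrix entries and the helper `swap_columns` are assumptions made for this illustration.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33 (0-based indices)."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

def swap_columns(A, j, k):
    """Copy of A with columns j and k interchanged."""
    B = [row[:] for row in A]
    for row in B:
        row[j], row[k] = row[k], row[j]
    return B

A = [[2, -1, 0], [4, 3, 5], [1, 7, -2]]   # illustrative values only
B = swap_columns(A, 0, 2)
print(det(A), det(B))
```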

If $T \in \mathcal{L}(V)$ and the matrix of $T$ (with respect to some basis) has two
equal columns, then $T$ is not injective and hence $\det T = 0$. Although this
comment makes the next result plausible, it cannot be used in the proof,
because we do not yet know that $\det T = \det \mathcal{M}(T)$ for every choice of basis.

10.37  Matrices  with  two  equal  columns 

If  A  is  a  square  matrix  that  has  two  equal  columns,  then  det  A  =  0. 

Proof Suppose $A$ is a square matrix that has two equal columns.
Interchanging the two equal columns of $A$ gives the original matrix $A$. Thus from 10.36
(with $B = A$), we have

\[ \det A = -\det A, \]

which implies that $\det A = 0$.

Recall from 3.44 that if $A$ is an $n$-by-$n$ matrix

\[ A = \begin{pmatrix} A_{1,1} & \dots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \dots & A_{n,n} \end{pmatrix}, \]

then we can think of the $k$th column of $A$ as an $n$-by-1 matrix denoted $A_{\cdot,k}$:

\[ A_{\cdot,k} = \begin{pmatrix} A_{1,k} \\ \vdots \\ A_{n,k} \end{pmatrix}. \]

Note that $A_{j,k}$, with two subscripts, denotes an entry of $A$, whereas $A_{\cdot,k}$, with
a dot as a placeholder and one subscript, denotes a column of $A$. This notation
allows us to write $A$ in the form

\[ ( A_{\cdot,1} \ \dots \ A_{\cdot,n} ), \]

which will be useful.

The  next  result  shows  that  a  permutation  of  the  columns  of  a  matrix 
changes  the  determinant  by  a  factor  of  the  sign  of  the  permutation. 

10.38 Permuting the columns of a matrix

Suppose $A = ( A_{\cdot,1} \ \dots \ A_{\cdot,n} )$ is an $n$-by-$n$ matrix and $(m_1, \dots, m_n)$
is a permutation. Then

\[ \det( A_{\cdot,m_1} \ \dots \ A_{\cdot,m_n} ) = \bigl(\operatorname{sign}(m_1, \dots, m_n)\bigr) \det A. \]

Proof We can transform the matrix $( A_{\cdot,m_1} \ \dots \ A_{\cdot,m_n} )$ into $A$ through a
series of steps. In each step, we interchange two columns and hence multiply
the determinant by $-1$ (see 10.36). The number of steps needed equals the
number of steps needed to transform the permutation $(m_1, \dots, m_n)$ into the
permutation $(1, \dots, n)$ by interchanging two entries in each step. The proof
is completed by noting that the number of such steps is even if $(m_1, \dots, m_n)$
has sign 1, odd if $(m_1, \dots, m_n)$ has sign $-1$ (this follows from 10.32, along
with the observation that the permutation $(1, \dots, n)$ has sign 1). ■
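
Result 10.38 can be checked exhaustively for a small matrix by trying every rearrangement of its columns. The sketch below uses an arbitrarily chosen 3-by-3 integer matrix and 0-based column indices; the helper `permute_columns` is an assumption made for this illustration.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33 (0-based indices)."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

def permute_columns(A, m):
    """Matrix whose k-th column is column m[k] of A."""
    return [[row[m_k] for m_k in m] for row in A]

A = [[1, 2, 0], [3, -1, 4], [2, 5, 1]]   # illustrative values only
checks = [det(permute_columns(A, m)) == sign(m) * det(A)
          for m in permutations(range(3))]
print(all(checks))
```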


Some books define the determinant to be the function defined on the
square matrices that is linear as a function of each column separately
and that satisfies 10.38 and $\det I = 1$. To prove that such a function
exists and that it is unique takes a nontrivial amount of work.

The next result about determinants will also be useful.


10.39 Determinant is a linear function of each column

Suppose $k, n$ are positive integers with $1 \le k \le n$. Fix $n$-by-1 matrices
$A_{\cdot,1}, \dots, A_{\cdot,n}$ except $A_{\cdot,k}$. Then the function that takes an $n$-by-1 column
vector $A_{\cdot,k}$ to

\[ \det( A_{\cdot,1} \ \dots \ A_{\cdot,k} \ \dots \ A_{\cdot,n} ) \]

is a linear map from the vector space of $n$-by-1 matrices with entries in $\mathbf{F}$
to $\mathbf{F}$.


Proof The linearity follows easily from 10.33, where each term in the sum
contains precisely one entry from the $k$th column of $A$. ■


Now  we  are  ready  to  prove  one  of 
the  key  properties  about  determinants 
of  square  matrices.  This  property  will 
enable  us  to  connect  the  determinant  of 
an  operator  with  the  determinant  of  its 

matrix.  Note  that  this  proof  is  considerably  more  complicated  than  the  proof 
of  the  corresponding  result  about  the  trace  (see  10.14). 


The  result  below  was  first  proved 
in  1812  by  French  mathematicians 
Jacques  Binet  and  Augustin-Louis 
Cauchy. 


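10.40 Determinant is multiplicative

Suppose $A$ and $B$ are square matrices of the same size. Then

\[ \det(AB) = \det(BA) = (\det A)(\det B). \]


Proof Write $A = ( A_{\cdot,1} \ \dots \ A_{\cdot,n} )$, where each $A_{\cdot,k}$ is an $n$-by-1 column
of $A$. Also write

\[ B = \begin{pmatrix} B_{1,1} & \dots & B_{1,n} \\ \vdots & & \vdots \\ B_{n,1} & \dots & B_{n,n} \end{pmatrix} = ( B_{\cdot,1} \ \dots \ B_{\cdot,n} ), \]

where each $B_{\cdot,k}$ is an $n$-by-1 column of $B$. Let $e_k$ denote the $n$-by-1 matrix
that equals 1 in the $k$th row and 0 elsewhere. Note that $Ae_k = A_{\cdot,k}$ and
$Be_k = B_{\cdot,k}$. Furthermore, $B_{\cdot,k} = \sum_{m=1}^{n} B_{m,k}e_m$.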

First we will prove $\det(AB) = (\det A)(\det B)$. As we observed earlier
(see 3.49), the definition of matrix multiplication easily implies that
$AB = ( AB_{\cdot,1} \ \dots \ AB_{\cdot,n} )$. Thus

\begin{align*}
\det(AB) &= \det( AB_{\cdot,1} \ \dots \ AB_{\cdot,n} ) \\
&= \det\Bigl( A\bigl(\sum_{m_1=1}^{n} B_{m_1,1}e_{m_1}\bigr) \ \dots \ A\bigl(\sum_{m_n=1}^{n} B_{m_n,n}e_{m_n}\bigr) \Bigr) \\
&= \det\Bigl( \sum_{m_1=1}^{n} B_{m_1,1}Ae_{m_1} \ \dots \ \sum_{m_n=1}^{n} B_{m_n,n}Ae_{m_n} \Bigr) \\
&= \sum_{m_1=1}^{n} \cdots \sum_{m_n=1}^{n} B_{m_1,1} \cdots B_{m_n,n} \det( Ae_{m_1} \ \dots \ Ae_{m_n} ),
\end{align*}

where the last equality comes from repeated applications of the linearity of det
as a function of one column at a time (10.39). In the last sum above, all terms
in which $m_j = m_k$ for some $j \ne k$ can be ignored, because the determinant
of a matrix with two equal columns is 0 (by 10.37). Thus instead of summing
over all $m_1, \dots, m_n$ with each $m_j$ taking on values $1, \dots, n$, we can sum just
over the permutations, where the $m_j$'s have distinct values. In other words,

\begin{align*}
\det(AB) &= \sum_{(m_1, \dots, m_n) \in \operatorname{perm} n} B_{m_1,1} \cdots B_{m_n,n} \det( Ae_{m_1} \ \dots \ Ae_{m_n} ) \\
&= \sum_{(m_1, \dots, m_n) \in \operatorname{perm} n} B_{m_1,1} \cdots B_{m_n,n} \bigl(\operatorname{sign}(m_1, \dots, m_n)\bigr) \det A \\
&= (\det A) \sum_{(m_1, \dots, m_n) \in \operatorname{perm} n} \bigl(\operatorname{sign}(m_1, \dots, m_n)\bigr) B_{m_1,1} \cdots B_{m_n,n} \\
&= (\det A)(\det B),
\end{align*}

where the second equality comes from 10.38.

In the paragraph above, we proved that $\det(AB) = (\det A)(\det B)$.
Interchanging the roles of $A$ and $B$, we have $\det(BA) = (\det B)(\det A)$. The
last equation can be rewritten as $\det(BA) = (\det A)(\det B)$, completing the
proof. ■
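
As a sanity check of 10.40, the sketch below evaluates both sides for two arbitrarily chosen 3-by-3 integer matrices, using the permutation-sum definition 10.33. The matrices and the helper names are assumptions made for this illustration.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33 (0-based indices)."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # illustrative values only
B = [[1, 2, 0], [-1, 1, 3], [2, 0, 1]]   # illustrative values only
print(det(matmul(A, B)), det(matmul(B, A)), det(A) * det(B))
```

Note that $AB \ne BA$ for these matrices, yet all three printed numbers agree.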


Note  the  similarity  of  the  proof  of 
the  next  result  to  the  proof  of  the 
analogous  result  about  the  trace 
(see  10.15). 


Now we can prove that the determinant of the matrix of an operator is
independent of the basis with respect to which the matrix is computed.


10.41 Determinant of matrix of operator does not depend on basis

Let $T \in \mathcal{L}(V)$. Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of $V$. Then

\[ \det \mathcal{M}(T, (u_1, \dots, u_n)) = \det \mathcal{M}(T, (v_1, \dots, v_n)). \]

Proof Let $A = \mathcal{M}(I, (u_1, \dots, u_n), (v_1, \dots, v_n))$. Then

\begin{align*}
\det \mathcal{M}(T, (u_1, \dots, u_n)) &= \det\Bigl( A^{-1} \bigl( \mathcal{M}(T, (v_1, \dots, v_n)) A \bigr) \Bigr) \\
&= \det\Bigl( \bigl( \mathcal{M}(T, (v_1, \dots, v_n)) A \bigr) A^{-1} \Bigr) \\
&= \det \mathcal{M}(T, (v_1, \dots, v_n)),
\end{align*}

where the first equality follows from 10.7 and the second equality follows
from 10.40. The third equality completes the proof. ■
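
The change-of-basis computation in the proof above can be illustrated with integer matrices. In this sketch `M_v` stands in for $\mathcal{M}(T, (v_1, \dots, v_n))$ and `A` for $\mathcal{M}(I, (u_1, \dots, u_n), (v_1, \dots, v_n))$; both are hypothetical data chosen so that the inverse of `A` has integer entries and can be entered by hand.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33 (0-based indices)."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M_v = [[2, 1, 0], [0, 3, 4], [5, 0, 1]]        # hypothetical matrix of T in the v-basis
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]          # hypothetical change-of-basis matrix
A_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]    # inverse of A, entered by hand

identity = [[int(i == j) for j in range(3)] for i in range(3)]
assert matmul(A, A_inv) == identity            # confirm A_inv really inverts A

M_u = matmul(A_inv, matmul(M_v, A))            # matrix of T in the u-basis, as in 10.7
print(det(M_u), det(M_v))
```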

The  result  below  states  that  the  determinant  of  an  operator  equals  the 
determinant  of  the  matrix  of  the  operator.  This  theorem  does  not  specify  a 
basis  because,  by  the  result  above,  the  determinant  of  the  matrix  of  an  operator 
is  the  same  for  every  choice  of  basis. 

10.42 Determinant of an operator equals determinant of its matrix
Suppose $T \in \mathcal{L}(V)$. Then $\det T = \det \mathcal{M}(T)$.

Proof As noted above, 10.41 implies that $\det \mathcal{M}(T)$ is independent of which
basis of $V$ we choose. Thus to show that $\det T = \det \mathcal{M}(T)$ for every basis
of $V$, we need only show that the result holds for some basis of $V$.

As we have already discussed, if $V$ is a complex vector space, then choosing
a basis of $V$ as in 8.29 gives the desired result. If $V$ is a real vector space,
then applying the complex case to the complexification $T_{\mathbf{C}}$ (which is used to
define $\det T$) gives the desired result. ■

If  we  know  the  matrix  of  an  operator  on  a  complex  vector  space,  the  result 
above  allows  us  to  find  the  product  of  all  the  eigenvalues  without  finding  any 
of  the  eigenvalues. 

10.43 Example Suppose $T$ is the operator on $\mathbf{C}^5$ whose matrix is

\[ \begin{pmatrix} 0 & 0 & 0 & 0 & -3 \\ 1 & 0 & 0 & 0 & 6 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \]

No one knows an exact formula for any of the eigenvalues of this operator.
However, we do know that the product of the eigenvalues equals $-3$, because
the determinant of the matrix above equals $-3$.

We can use 10.42 to give easy proofs of some useful properties about
determinants of operators by shifting to the language of determinants of
matrices, where certain properties have already been proved or are obvious.
We carry out this procedure in the next result.
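
Before moving on, the determinant claimed in Example 10.43 can be checked directly from definition 10.33. Only one of the $5! = 120$ permutations contributes a nonzero term, namely $(2, 3, 4, 5, 1)$, which has sign 1. The following is a minimal sketch; the function names are assumptions made for this illustration.

```python
from itertools import permutations
from math import prod

def sign(m):
    """Sign as in 10.30, computed by counting out-of-order pairs."""
    out_of_order = sum(m[i] > m[j]
                       for i in range(len(m))
                       for j in range(i + 1, len(m)))
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant as in 10.33 (0-based indices)."""
    n = len(A)
    return sum(sign(m) * prod(A[m[k]][k] for k in range(n))
               for m in permutations(range(n)))

# Matrix of the operator in Example 10.43.
M = [
    [0, 0, 0, 0, -3],
    [1, 0, 0, 0, 6],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

# Count the permutations whose term in 10.33 is nonzero.
nonzero_terms = [m for m in permutations(range(5))
                 if prod(M[m[k]][k] for k in range(5)) != 0]
print(det(M), nonzero_terms)
```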

10.44 Determinant is multiplicative
Suppose $S, T \in \mathcal{L}(V)$. Then

\[ \det(ST) = \det(TS) = (\det S)(\det T). \]

Proof Choose a basis of $V$. Then

\begin{align*}
\det(ST) &= \det \mathcal{M}(ST) \\
&= \det\bigl( \mathcal{M}(S)\mathcal{M}(T) \bigr) \\
&= \bigl(\det \mathcal{M}(S)\bigr)\bigl(\det \mathcal{M}(T)\bigr) \\
&= (\det S)(\det T),
\end{align*}

where the first and last equalities come from 10.42, the second equality holds
because the matrix of a product of operators is the product of their matrices,
and the third equality comes from 10.40.

In the paragraph above, we proved that $\det(ST) = (\det S)(\det T)$.
Interchanging the roles of $S$ and $T$, we have $\det(TS) = (\det T)(\det S)$. Because
multiplication of elements of $\mathbf{F}$ is commutative, the last equation can be
rewritten as $\det(TS) = (\det S)(\det T)$, completing the proof. ■

The  Sign  of  the  Determinant 

We  proved  the  basic  results  of  linear  algebra  before  introducing  determinants 
in  this  final  chapter.  Although  determinants  have  value  as  a  research  tool  in 
more  advanced  subjects,  they  play  little  role  in  basic  linear  algebra  (when  the 
subject  is  done  right). 

Determinants do have one important application in undergraduate
mathematics, namely, in computing certain volumes and integrals. In this subsection
we interpret the meaning of the sign of the determinant on a real vector space.
Then in the final subsection we will
use the linear algebra we have learned to make clear the connection between
determinants and these applications. Thus we will be dealing with a part of
analysis that uses linear algebra.


Most applied mathematicians agree that determinants should rarely be
used in serious numeric calculations.

We will begin with some purely linear algebra results that will also be
useful when investigating volumes. Our setting will be inner product spaces.
Recall that an isometry on an inner product space is an operator that preserves
norms. The next result shows that every isometry has determinant with
absolute value 1.

10.45 Isometries have determinant with absolute value 1

Suppose $V$ is an inner product space and $S \in \mathcal{L}(V)$ is an
isometry. Then $|\det S| = 1$.

Proof First consider the case where $V$ is a complex inner product space.
Then all the eigenvalues of $S$ have absolute value 1 (see the proof of 7.43).
Thus the product of the eigenvalues of $S$, counting multiplicity, has absolute
value 1. In other words, $|\det S| = 1$, as desired.

Now  suppose  V  is  a  real  inner  product  space.  We  present  two  different 
proofs  in  this  case. 

Proof 1: With respect to the inner product on the complexification
$V_{\mathbf{C}}$ given by Exercise 3 in Section 9.B, it is easy to see that
$S_{\mathbf{C}}$ is an isometry on $V_{\mathbf{C}}$. Thus by the complex case
that we have already done, we have $|\det S_{\mathbf{C}}| = 1$. By definition
of the determinant on real vector spaces, we have
$\det S = \det S_{\mathbf{C}}$ and thus $|\det S| = 1$, completing the proof.

Proof 2: By 9.36, there is an orthonormal basis of $V$ with respect to which
$\mathcal{M}(S)$ is a block diagonal matrix, where each block on the diagonal
is a 1-by-1 matrix containing $1$ or $-1$ or a 2-by-2 matrix of the form

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

with $\theta \in (0, \pi)$. Note that the determinant of each 2-by-2 matrix of
the form above equals 1 (because $\cos^2\theta + \sin^2\theta = 1$). Thus the
determinant of $S$, which is the product of the determinants of the blocks
(see Exercise 6), is the product of $1$'s and $-1$'s. Hence $|\det S| = 1$,
as desired. ■
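The block description used in Proof 2 can be explored numerically. The sketch below builds a block diagonal matrix from 1-by-1 blocks containing $1$ or $-1$ and 2-by-2 rotation blocks, then computes its determinant directly. The angles 0.7 and 2.1 and the helper names are assumptions chosen only for illustration.

```python
import math

def rotation(theta):
    """2-by-2 block of the form appearing in Proof 2."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def block_diagonal(blocks):
    """Assemble square blocks along the diagonal of a larger square matrix."""
    n = sum(len(b) for b in blocks)
    m = [[0.0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                m[offset + i][offset + j] = value
        offset += len(b)
    return m

def det(a):
    """Cofactor expansion along the first row (fine for small matrices)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def isometry(minus_ones):
    """Hypothetical isometry: one 1, `minus_ones` copies of -1, two rotations."""
    blocks = [[[1.0]]] + [[[-1.0]]] * minus_ones + [rotation(0.7), rotation(2.1)]
    return block_diagonal(blocks)

# The determinant depends only on the parity of the number of -1 blocks.
for k in range(4):
    print(k, round(det(isometry(k)), 12))
```

The determinant is computed from the full matrix rather than block by block, so the check does not assume Exercise 6.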

The Real Spectral Theorem 7.29 states that if $T$ is a self-adjoint operator
on a real inner product space, then the space has an orthonormal basis
consisting of eigenvectors of $T$. With respect to such a basis, the number of
times each eigenvalue appears on the diagonal of $\mathcal{M}(T)$ is its
multiplicity. Thus $\det T$ equals the product of its eigenvalues, counting
multiplicity (of course, this holds for every operator, self-adjoint or not,
on a complex vector space).

Recall that if $V$ is an inner product space and $T \in \mathcal{L}(V)$, then
$T^*T$ is a positive operator and hence has a unique positive square root,
denoted $\sqrt{T^*T}$ (see 7.35 and 7.36). Because $\sqrt{T^*T}$ is positive,
all its eigenvalues are nonnegative (again, see 7.35), and hence
$\det\sqrt{T^*T} \ge 0$. These considerations play a role in the next example.


10.46 Example Suppose $V$ is a real inner product space and
$T \in \mathcal{L}(V)$ is invertible (and thus $\det T$ is either positive or
negative). Attach a geometric meaning to the sign of $\det T$.

Solution First we consider an isometry $S \in \mathcal{L}(V)$. By 10.45, the
determinant of $S$ equals $1$ or $-1$. Note that

$$\{v \in V : Sv = -v\}$$

is the eigenspace $E(-1, S)$. Thinking geometrically, we could say that this
is the subspace on which $S$ reverses direction. An examination of proof 2 of
10.45 shows that $\det S = 1$ if this subspace has even dimension and
$\det S = -1$ if this subspace has odd dimension.

We are not formally defining the phrase "reverses direction" because these
comments are meant only as an intuitive aid to our understanding.

Returning to our arbitrary invertible operator $T \in \mathcal{L}(V)$, by the
Polar Decomposition (7.45) there is an isometry $S \in \mathcal{L}(V)$ such
that

$$T = S\sqrt{T^*T}.$$

Now 10.44 tells us that

$$\det T = (\det S)\bigl(\det\sqrt{T^*T}\bigr).$$

The remarks just before this example pointed out that
$\det\sqrt{T^*T} \ge 0$; here it is in fact positive, because $T$ is invertible
and so $\det T \ne 0$. Thus whether $\det T$ is positive or negative depends on
whether $\det S$ is positive or negative. As we saw in the paragraph above,
this depends on whether the subspace on which $S$ reverses direction has even
or odd dimension.

Because $T$ is the product of $S$ and an operator that never reverses direction
(namely, $\sqrt{T^*T}$), we can reasonably say that whether $\det T$ is
positive or negative depends on whether $T$ reverses vectors an even or an odd
number of times.

Volume

The next result will be a key tool in our investigation of volume. Recall that
our remarks before Example 10.46 pointed out that $\det\sqrt{T^*T} \ge 0$.


10.47 $|\det T| = \det\sqrt{T^*T}$

Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$. Then

$$|\det T| = \det\sqrt{T^*T}.$$

Proof By the Polar Decomposition (7.45), there is an isometry
$S \in \mathcal{L}(V)$ such that

$$T = S\sqrt{T^*T}.$$

Thus

$$\begin{aligned}
|\det T| &= |\det S|\,\det\sqrt{T^*T} \\
&= \det\sqrt{T^*T},
\end{aligned}$$

where the first equality follows from 10.44 and the second equality follows
from 10.45. ■

Another proof of this result is suggested in Exercise 8.
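For a concrete feel for 10.46 and 10.47, the sketch below carries out the Polar Decomposition of one invertible operator on $\mathbf{R}^2$. It assumes the standard basis (so the matrix of $T^*$ is the transpose) and uses a closed-form square root that is valid only for positive definite 2-by-2 matrices; that formula, the sample matrix `T`, and all helper names are assumptions of this illustration rather than anything stated in the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inverse2(a):
    d = det2(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def sqrt_positive2(a):
    """Positive square root of a positive definite 2-by-2 matrix.

    Uses R = (A + s I) / t with s = sqrt(det A) and t = sqrt(trace A + 2 s),
    which follows from the Cayley-Hamilton relation for 2-by-2 matrices.
    """
    s = math.sqrt(det2(a))
    t = math.sqrt(a[0][0] + a[1][1] + 2 * s)
    return [[(a[0][0] + s) / t, a[0][1] / t],
            [a[1][0] / t, (a[1][1] + s) / t]]

T = [[1.0, 3.0], [2.0, 1.0]]                 # sample operator with det T = -5
R = sqrt_positive2(matmul(transpose(T), T))  # matrix of sqrt(T*T)
S = matmul(T, inverse2(R))                   # isometry with T = S sqrt(T*T)

print("det T =", det2(T))
print("det sqrt(T*T) =", det2(R))
print("det S =", det2(S))
```

In this example the isometry `S` carries the sign of $\det T$, while `R` carries its absolute value.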


Now we turn to the question of volume in $\mathbf{R}^n$. Fix a positive
integer $n$ for the rest of this subsection. We will consider only the real
inner product space $\mathbf{R}^n$, with its standard inner product.

We would like to assign to each subset $\Omega$ of $\mathbf{R}^n$ its
$n$-dimensional volume (when $n = 2$, this is usually called area instead of
volume). We begin with boxes, where we have a good intuitive notion of volume.


10.48 Definition box

A box in $\mathbf{R}^n$ is a set of the form

$$\{(y_1, \dots, y_n) \in \mathbf{R}^n : x_j < y_j < x_j + r_j \text{ for } j = 1, \dots, n\},$$

where $r_1, \dots, r_n$ are positive numbers and
$(x_1, \dots, x_n) \in \mathbf{R}^n$. The numbers $r_1, \dots, r_n$ are called
the side lengths of the box.

You should verify that when $n = 2$, a box is a rectangle with sides parallel
to the coordinate axes, and that when $n = 3$, a box is a familiar
3-dimensional box with sides parallel to the coordinate axes.

The next definition fits with our intuitive notion of volume, because we
define the volume of a box to be the product of the side lengths of the box.

10.49 Definition volume of a box

The volume of a box $B$ in $\mathbf{R}^n$ with side lengths $r_1, \dots, r_n$
is defined to be $r_1 \cdots r_n$ and is denoted by $\operatorname{volume} B$.

To define the volume of an arbitrary set $\Omega \subset \mathbf{R}^n$, the
idea is to write $\Omega$ as a subset of a union of many small boxes, then add
up the volumes of these small boxes. As we approximate $\Omega$ more
accurately by unions of small boxes, we get a better estimate of
$\operatorname{volume} \Omega$.

10.50 Definition volume

Suppose $\Omega \subset \mathbf{R}^n$. Then the volume of $\Omega$, denoted
$\operatorname{volume} \Omega$, is defined to be the infimum of

$$\operatorname{volume} B_1 + \operatorname{volume} B_2 + \cdots,$$

where the infimum is taken over all sequences $B_1, B_2, \dots$ of boxes in
$\mathbf{R}^n$ whose union contains $\Omega$.
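To see Definition 10.50 at work in a case where the answer is known, the sketch below covers the open unit disk in $\mathbf{R}^2$ by grid squares of side $1/k$ and adds up their areas. The choice of set and the function name `cover_area` are assumptions for illustration. The sketch also ignores the shared edges of the squares, which could be covered by additional boxes of arbitrarily small total area.

```python
import math

def cover_area(k):
    """Total area of the grid squares of side 1/k that meet the open unit disk."""
    count = 0
    for i in range(-k, k):
        for j in range(-k, k):
            # Corner of the square [i, i+1] x [j, j+1] (in units of 1/k)
            # that is nearest to the origin.
            nx = i if i >= 0 else i + 1
            ny = j if j >= 0 else j + 1
            # The square meets the open disk exactly when that corner is inside it.
            if nx * nx + ny * ny < k * k:
                count += 1
    return count / (k * k)

# Each sum is an upper estimate of the area; finer grids give smaller sums.
for k in (10, 50, 200):
    print(k, cover_area(k), "compared with pi =", math.pi)
```

Every such sum is at least the area of the disk, and the estimates shrink toward $\pi$ as the squares get smaller, which is the behavior the infimum in the definition captures.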

We  will  work  only  with  an  intuitive  notion  of  volume.  Our  purpose  in  this 
book  is  to  understand  linear  algebra,  whereas  notions  of  volume  belong  to 
analysis  (although  volume  is  intimately  connected  with  determinants,  as  we 
will  soon  see).  Thus  for  the  rest  of  this  section  we  will  rely  on  intuitive  notions 
of  volume  rather  than  on  a  rigorous  development,  although  we  shall  maintain 
our  usual  rigor  in  the  linear  algebra  parts  of  what  follows.  Everything  said 
here  about  volume  will  be  correct  if  appropriately  interpreted — the  intuitive 
approach  used  here  can  be  converted  into  appropriate  correct  definitions, 
correct  statements,  and  correct  proofs  using  the  machinery  of  analysis. 

10.51 Notation $T(\Omega)$

For $T$ a function defined on a set $\Omega$, define $T(\Omega)$ by

$$T(\Omega) = \{Tx : x \in \Omega\}.$$

For $T \in \mathcal{L}(\mathbf{R}^n)$ and $\Omega \subset \mathbf{R}^n$, we
seek a formula for $\operatorname{volume} T(\Omega)$ in terms of $T$ and
$\operatorname{volume} \Omega$. We begin by looking at positive operators.

Readers familiar with outer measure will recognize that concept in the
definition of volume (10.50).

10.52 Positive operators change volume by factor of determinant

Suppose $T \in \mathcal{L}(\mathbf{R}^n)$ is a positive operator and
$\Omega \subset \mathbf{R}^n$. Then

$$\operatorname{volume} T(\Omega) = (\det T)(\operatorname{volume} \Omega).$$


Proof To get a feeling for why this result is true, first consider the special
case where $\lambda_1, \dots, \lambda_n$ are positive numbers and
$T \in \mathcal{L}(\mathbf{R}^n)$ is defined by

$$T(x_1, \dots, x_n) = (\lambda_1 x_1, \dots, \lambda_n x_n).$$

This operator stretches the $j$th standard basis vector by a factor of
$\lambda_j$. If $B$ is a box in $\mathbf{R}^n$ with side lengths
$r_1, \dots, r_n$, then $T(B)$ is a box in $\mathbf{R}^n$ with side lengths
$\lambda_1 r_1, \dots, \lambda_n r_n$. The box $T(B)$ thus has volume
$\lambda_1 \cdots \lambda_n r_1 \cdots r_n$, whereas the box $B$ has volume
$r_1 \cdots r_n$. Note that $\det T = \lambda_1 \cdots \lambda_n$. Thus

$$\operatorname{volume} T(B) = (\det T)(\operatorname{volume} B)$$

for every box $B$ in $\mathbf{R}^n$. Because the volume of $\Omega$ is
approximated by sums of volumes of boxes, this implies that
$\operatorname{volume} T(\Omega) = (\det T)(\operatorname{volume} \Omega)$.

Now consider an arbitrary positive operator
$T \in \mathcal{L}(\mathbf{R}^n)$. By the Real Spectral Theorem (7.29), there
exist an orthonormal basis $e_1, \dots, e_n$ of $\mathbf{R}^n$ and nonnegative
numbers $\lambda_1, \dots, \lambda_n$ such that $Te_j = \lambda_j e_j$ for
$j = 1, \dots, n$. In the special case where $e_1, \dots, e_n$ is the standard
basis of $\mathbf{R}^n$, this operator is the same one as defined in the
paragraph above. For an arbitrary orthonormal basis $e_1, \dots, e_n$, this
operator has the same behavior as the one in the paragraph above: it stretches
the $j$th basis vector in an orthonormal basis by a factor of $\lambda_j$.
Your intuition about volume should convince you that volume behaves the same
with respect to each orthonormal basis. That intuition, and the special case
of the paragraph above, should convince you that $T$ multiplies volume by a
factor of $\lambda_1 \cdots \lambda_n$, which again equals $\det T$. ■


Our  next  tool  is  the  following  result,  which  states  that  isometries  do  not 
change  volume. 


10.53 An isometry does not change volume

Suppose $S \in \mathcal{L}(\mathbf{R}^n)$ is an isometry and
$\Omega \subset \mathbf{R}^n$. Then

$$\operatorname{volume} S(\Omega) = \operatorname{volume} \Omega.$$

Proof For $x, y \in \mathbf{R}^n$, we have

$$\|Sx - Sy\| = \|S(x - y)\| = \|x - y\|.$$

In other words, $S$ does not change the distance between points. That property
alone may be enough to convince you that $S$ does not change volume.

However, if you need stronger persuasion, consider the complete description
of isometries on real inner product spaces provided by 9.36. According to
9.36, $S$ can be decomposed into pieces, each of which is the identity on some
subspace (which clearly does not change volume) or multiplication by $-1$ on
some subspace (which again clearly does not change volume) or a rotation
on a 2-dimensional subspace (which again does not change volume). Or use
9.36 in conjunction with Exercise 7 in Section 9.B to write $S$ as a product of
operators, each of which does not change volume. Either way, you should be
convinced that $S$ does not change volume. ■


Now we can prove that an operator $T \in \mathcal{L}(\mathbf{R}^n)$ changes
volume by a factor of $|\det T|$. Note the huge importance of the Polar
Decomposition in the proof.


10.54 $T$ changes volume by factor of $|\det T|$

Suppose $T \in \mathcal{L}(\mathbf{R}^n)$ and $\Omega \subset \mathbf{R}^n$.
Then

$$\operatorname{volume} T(\Omega) = |\det T|\,(\operatorname{volume} \Omega).$$


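Proof By the Polar Decomposition (7.45), there is an isometry
$S \in \mathcal{L}(\mathbf{R}^n)$ such that

$$T = S\sqrt{T^*T}.$$

If $\Omega \subset \mathbf{R}^n$, then
$T(\Omega) = S\bigl(\sqrt{T^*T}(\Omega)\bigr)$. Thus

$$\begin{aligned}
\operatorname{volume} T(\Omega) &= \operatorname{volume} S\bigl(\sqrt{T^*T}(\Omega)\bigr) \\
&= \operatorname{volume} \sqrt{T^*T}(\Omega) \\
&= \bigl(\det\sqrt{T^*T}\bigr)(\operatorname{volume} \Omega) \\
&= |\det T|\,(\operatorname{volume} \Omega),
\end{aligned}$$

where the second equality holds because volume is not changed by the isometry
$S$ (by 10.53), the third equality holds by 10.52 (applied to the positive
operator $\sqrt{T^*T}$), and the fourth equality holds by 10.47. ■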
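Result 10.54 can be checked by hand in $\mathbf{R}^2$ for polygons, whose areas are computable from their vertices. The sketch below applies a sample operator to the unit square and to a triangle and compares areas. The operator `T`, the two polygons, and the use of the shoelace formula for area are assumptions made for this illustration.

```python
def apply(t, v):
    """Apply the operator with 2-by-2 matrix t to the point v."""
    return (t[0][0] * v[0] + t[0][1] * v[1],
            t[1][0] * v[0] + t[1][1] * v[1])

def polygon_area(points):
    """Shoelace formula; the absolute value makes orientation irrelevant."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

T = [[1.0, 3.0], [2.0, 1.0]]   # sample operator; det T = 1*1 - 3*2 = -5
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]

for name, omega in (("square", square), ("triangle", triangle)):
    image = [apply(T, v) for v in omega]   # vertices of T(omega)
    print(name, polygon_area(omega), polygon_area(image), abs(det_T))
```

Because `T` here has negative determinant, the image polygon is traversed in the opposite orientation; the area still scales by $|\det T|$, not by $\det T$.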
The result that we just proved leads to the appearance of determinants in
the formula for change of variables in multivariable integration. To describe
this, we will again be vague and intuitive.

Throughout this book, almost all the functions we have encountered have
been linear. Thus please be aware that the functions $f$ and $\sigma$ in the
material below are not assumed to be linear.

The  next  definition  aims  at  conveying  the  idea  of  the  integral;  it  is  not 
intended  as  a  rigorous  definition. 

10.55 Definition integral, $\int_\Omega f$

If $\Omega \subset \mathbf{R}^n$ and $f$ is a real-valued function on
$\Omega$, then the integral of $f$ over $\Omega$, denoted $\int_\Omega f$ or
$\int_\Omega f(x)\,dx$, is defined by breaking $\Omega$ into pieces small
enough that $f$ is almost constant on each piece. On each piece, multiply the
(almost constant) value of $f$ by the volume of the piece, then add up these
numbers for all the pieces, getting an approximation to the integral that
becomes more accurate as $\Omega$ is divided into finer pieces.


Actually, $\Omega$ in the definition above needs to be a reasonable set (for
example, open or measurable) and $f$ needs to be a reasonable function (for
example, continuous or measurable), but we will not worry about those
technicalities. Also, notice that the $x$ in $\int_\Omega f(x)\,dx$ is a dummy
variable and could be replaced with any other symbol.

Now  we  define  the  notions  of  differentiable  and  derivative.  Notice  that 
in  this  context,  the  derivative  is  an  operator,  not  a  number  as  in  one-variable 
calculus.  The  uniqueness  of  T  in  the  definition  below  is  left  as  Exercise  9. 


10.56 Definition differentiable, derivative, $\sigma'(x)$

Suppose $\Omega$ is an open subset of $\mathbf{R}^n$ and $\sigma$ is a
function from $\Omega$ to $\mathbf{R}^n$. For $x \in \Omega$, the function
$\sigma$ is called differentiable at $x$ if there exists an operator
$T \in \mathcal{L}(\mathbf{R}^n)$ such that

$$\lim_{y \to 0} \frac{\|\sigma(x + y) - \sigma(x) - Ty\|}{\|y\|} = 0.$$

If $\sigma$ is differentiable at $x$, then the unique operator
$T \in \mathcal{L}(\mathbf{R}^n)$ satisfying the equation above is called the
derivative of $\sigma$ at $x$ and is denoted by $\sigma'(x)$.

328 


CHAPTER  10  Trace  and  Determinant 


The idea of the derivative is that for $x$ fixed and $\|y\|$ small,

$$\sigma(x + y) \approx \sigma(x) + \bigl(\sigma'(x)\bigr)(y);$$

because $\sigma'(x) \in \mathcal{L}(\mathbf{R}^n)$, this makes sense.

Suppose $\Omega$ is an open subset of $\mathbf{R}^n$ and $\sigma$ is a
function from $\Omega$ to $\mathbf{R}^n$. We can write

$$\sigma(x) = \bigl(\sigma_1(x), \dots, \sigma_n(x)\bigr),$$

where each $\sigma_j$ is a function from $\Omega$ to $\mathbf{R}$. The partial
derivative of $\sigma_j$ with respect to the $k$th coordinate is denoted
$D_k\sigma_j$. Evaluating this partial derivative at a point $x \in \Omega$
gives $D_k\sigma_j(x)$. If $\sigma$ is differentiable at $x$, then the matrix
of $\sigma'(x)$ with respect to the standard basis of $\mathbf{R}^n$ contains
$D_k\sigma_j(x)$ in row $j$, column $k$ (this is left as an exercise). In
other words,

10.57

$$\mathcal{M}\bigl(\sigma'(x)\bigr) = \begin{pmatrix}
D_1\sigma_1(x) & \dots & D_n\sigma_1(x) \\
\vdots & & \vdots \\
D_1\sigma_n(x) & \dots & D_n\sigma_n(x)
\end{pmatrix}.$$
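The matrix in 10.57 and the limit in Definition 10.56 can both be examined numerically. The sketch below approximates the partial derivatives by central differences and then evaluates the quotient from 10.56 for shrinking $y$. The map `sigma`, the step size `h`, and the helper names are hypothetical choices for illustration only.

```python
import math

def sigma(x):
    """Hypothetical nonlinear map from R^2 to R^2, chosen only for illustration."""
    return [x[0] ** 2 - x[1] ** 2, 2 * x[0] * x[1]]

def jacobian(func, x, h=1e-6):
    """Central differences: entry in row j, column k approximates D_k func_j(x)."""
    n = len(x)
    columns = []
    for k in range(n):
        plus, minus = list(x), list(x)
        plus[k] += h
        minus[k] -= h
        fp, fm = func(plus), func(minus)
        columns.append([(fp[j] - fm[j]) / (2 * h) for j in range(n)])
    # columns[k][j] holds D_k func_j, so transpose to get row j, column k.
    return [[columns[k][j] for k in range(n)] for j in range(n)]

def difference_quotient(func, x, matrix, y):
    """The ratio ||func(x + y) - func(x) - matrix y|| / ||y|| from 10.56 (n = 2)."""
    n = len(x)
    shifted = func([x[i] + y[i] for i in range(n)])
    base = func(x)
    linear = [sum(matrix[j][k] * y[k] for k in range(n)) for j in range(n)]
    error = [shifted[j] - base[j] - linear[j] for j in range(n)]
    return math.hypot(error[0], error[1]) / math.hypot(y[0], y[1])

x = [1.0, 2.0]
M = jacobian(sigma, x)
print(M)
# The quotient shrinks as y shrinks, as the definition of the derivative requires.
for scale in (1e-1, 1e-2, 1e-3):
    print(scale, difference_quotient(sigma, x, M, [scale, scale]))
```

For this particular `sigma`, the partial derivatives at $(1, 2)$ can also be found by hand, giving rows $(2, -4)$ and $(4, 2)$, which is a useful comparison for the numerical output.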

Now we can state the change of variables integration formula. Some
additional mild hypotheses are needed for $f$ and $\sigma'$ (such as
continuity or measurability), but we will not worry about them because the
proof below is really a pseudoproof that is intended to convey the reason the
result is true.

The result below is called a change of variables formula because you can
think of $y = \sigma(x)$ as a change of variables, as illustrated by the two
examples that follow the proof.

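10.58 Change of variables in an integral

Suppose $\Omega$ is an open subset of $\mathbf{R}^n$ and
$\sigma\colon \Omega \to \mathbf{R}^n$ is differentiable at every point of
$\Omega$. If $f$ is a real-valued function defined on $\sigma(\Omega)$, then

$$\int_{\sigma(\Omega)} f(y)\,dy = \int_{\Omega} f\bigl(\sigma(x)\bigr)\,|\det \sigma'(x)|\,dx.$$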

Proof Let $x \in \Omega$ and let $\Gamma$ be a small subset of $\Omega$
containing $x$ such that $f$ is approximately equal to the constant
$f(\sigma(x))$ on the set $\sigma(\Gamma)$.

Adding a fixed vector [such as $\sigma(x)$] to each vector in a set produces
another set with the same volume. Thus our approximation for $\sigma$ near $x$
using the derivative shows that

$$\operatorname{volume} \sigma(\Gamma) \approx \operatorname{volume}\bigl[\bigl(\sigma'(x)\bigr)(\Gamma)\bigr].$$

Using 10.54 applied to the operator $\sigma'(x)$, this becomes

$$\operatorname{volume} \sigma(\Gamma) \approx |\det \sigma'(x)|\,(\operatorname{volume} \Gamma).$$

Let $y = \sigma(x)$. Multiply the left side of the equation above by $f(y)$
and the right side by $f(\sigma(x))$ [because $y = \sigma(x)$, these two
quantities are equal], getting

$$f(y)\operatorname{volume} \sigma(\Gamma) \approx f\bigl(\sigma(x)\bigr)\,|\det \sigma'(x)|\,(\operatorname{volume} \Gamma).$$

Now break $\Omega$ into many small pieces and add the corresponding versions
of the equation above, getting the desired result. ■

If $n = 1$, then the derivative in the sense of Definition 10.56 is the
operator on $\mathbf{R}$ of multiplication by the derivative in the usual
sense of one-variable calculus.

The key point when making a change of variables is that the factor of
$|\det \sigma'(x)|$ must be included when making a substitution
$y = \sigma(x)$, as in the right side of 10.58. We finish up by illustrating
this point with two important examples.


10.59 Example polar coordinates

Define $\sigma\colon \mathbf{R}^2 \to \mathbf{R}^2$ by

$$\sigma(r, \theta) = (r\cos\theta, r\sin\theta),$$

where we have used $r, \theta$ as the coordinates instead of $x_1, x_2$ for
reasons that will be obvious to everyone familiar with polar coordinates (and
will be a mystery to everyone else). For this choice of $\sigma$, the matrix
of partial derivatives corresponding to 10.57 is

$$\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},$$

as you should verify. The determinant of the matrix above equals $r$, thus
explaining why a factor of $r$ is needed when computing an integral in polar
coordinates.
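The role of the factor $r$ can be seen numerically by integrating one function over the unit disk in three ways: in rectangular coordinates, in polar coordinates with the factor $r$, and in polar coordinates without it. The integrand `f`, the grid sizes, and the midpoint sums below are assumptions of this sketch; for this `f` the exact value of the integral is $\pi/2$.

```python
import math

def f(x, y):
    """Sample integrand, chosen only for illustration."""
    return x * x + y * y

def cartesian_sum(n):
    """Midpoint sum of f over the unit disk using an n-by-n grid on [-1, 1]^2."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:   # keep only grid cells centered in the disk
                total += f(x, y) * h * h
    return total

def polar_sum(n, include_factor=True):
    """Midpoint sum over 0 < r < 1, 0 < theta < 2*pi, optionally with the factor r."""
    dr, dtheta = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        weight = r if include_factor else 1.0
        for j in range(n):
            theta = (j + 0.5) * dtheta
            total += f(r * math.cos(theta), r * math.sin(theta)) * weight * dr * dtheta
    return total

print("rectangular:        ", cartesian_sum(400))
print("polar with r:       ", polar_sum(200))
print("polar without r:    ", polar_sum(200, include_factor=False))
print("exact value pi / 2: ", math.pi / 2)
```

Only the polar sum that includes the factor $r$ agrees with the rectangular sum; dropping the factor gives a noticeably different number.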

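For example, note the extra factor of $r$ in the following familiar formula
involving integrating a function $f$ over a disk in $\mathbf{R}^2$:

$$\int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} f(x, y)\,dy\,dx
= \int_{0}^{2\pi} \int_{0}^{1} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.$$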
10.60 Example spherical coordinates

Define $\sigma\colon \mathbf{R}^3 \to \mathbf{R}^3$ by

$$\sigma(\rho, \varphi, \theta) = (\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi),$$

where we have used $\rho, \theta, \varphi$ as the coordinates instead of
$x_1, x_2, x_3$ for reasons that will be obvious to everyone familiar with
spherical coordinates (and will be a mystery to everyone else). For this
choice of $\sigma$, the matrix of partial derivatives corresponding to 10.57
is

$$\begin{pmatrix}
\sin\varphi\cos\theta & \rho\cos\varphi\cos\theta & -\rho\sin\varphi\sin\theta \\
\sin\varphi\sin\theta & \rho\cos\varphi\sin\theta & \rho\sin\varphi\cos\theta \\
\cos\varphi & -\rho\sin\varphi & 0
\end{pmatrix},$$

as you should verify. The determinant of the matrix above equals
$\rho^2\sin\varphi$, thus explaining why a factor of $\rho^2\sin\varphi$ is
needed when computing an integral in spherical coordinates.
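The determinant claimed in this example can be spot-checked numerically, and the resulting factor can be used to recover the volume of the unit ball. The sketch below evaluates the matrix of partial derivatives at a few sample points and then forms a midpoint sum of its determinant. The sample points, the grid size, and the helper names are assumptions for illustration.

```python
import math

def jacobian_matrix(rho, phi, theta):
    """Matrix of partial derivatives from Example 10.60 (columns: rho, phi, theta)."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [[sp * ct, rho * cp * ct, -rho * sp * st],
            [sp * st, rho * cp * st, rho * sp * ct],
            [cp, -rho * sp, 0.0]]

def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def ball_volume(n):
    """Midpoint sum of |det| over 0 < rho < 1, 0 < phi < pi.

    The determinant does not depend on theta, so the theta integral
    over (0, 2*pi) contributes a factor of 2*pi.
    """
    d_rho, d_phi = 1.0 / n, math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            total += abs(det3(jacobian_matrix(rho, phi, 0.0))) * d_rho * d_phi
    return 2.0 * math.pi * total

# Compare the determinant with rho^2 sin(phi) at a few sample points.
for rho, phi, theta in [(0.5, 1.0, 0.3), (2.0, 2.5, 4.0), (1.3, 0.2, 5.5)]:
    print(det3(jacobian_matrix(rho, phi, theta)), rho ** 2 * math.sin(phi))

print(ball_volume(100), 4.0 * math.pi / 3.0)
```

The last line compares the midpoint sum with $4\pi/3$, the familiar volume of the unit ball.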

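For example, note the extra factor of $\rho^2\sin\varphi$ in the following
familiar formula involving integrating a function $f$ over a ball in
$\mathbf{R}^3$:

$$\int_{-1}^{1} \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} \int_{-\sqrt{1 - x^2 - y^2}}^{\sqrt{1 - x^2 - y^2}} f(x, y, z)\,dz\,dy\,dx$$

$$= \int_{0}^{2\pi} \int_{0}^{\pi} \int_{0}^{1} f(\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi)\,\rho^2\sin\varphi\,d\rho\,d\varphi\,d\theta.$$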


EXERCISES 10.B


1 Suppose $V$ is a real vector space. Suppose $T \in \mathcal{L}(V)$ has no
eigenvalues. Prove that $\det T > 0$.

2 Suppose $V$ is a real vector space with even dimension and
$T \in \mathcal{L}(V)$. Suppose $\det T < 0$. Prove that $T$ has at least two
distinct eigenvalues.

3 Suppose $T \in \mathcal{L}(V)$ and $n = \dim V > 2$. Let
$\lambda_1, \dots, \lambda_n$ denote the eigenvalues of $T$ (or of
$T_{\mathbf{C}}$ if $V$ is a real vector space), repeated according to
multiplicity.

(a) Find a formula for the coefficient of $z^{n-2}$ in the characteristic
polynomial of $T$ in terms of $\lambda_1, \dots, \lambda_n$.

(b) Find a formula for the coefficient of $z$ in the characteristic
polynomial of $T$ in terms of $\lambda_1, \dots, \lambda_n$.

4 Suppose $T \in \mathcal{L}(V)$ and $c \in \mathbf{F}$. Prove that
$\det(cT) = c^{\dim V} \det T$.

5 Prove or give a counterexample: if $S, T \in \mathcal{L}(V)$, then
$\det(S + T) = \det S + \det T$.

6 Suppose $A$ is a block upper-triangular matrix

$$A = \begin{pmatrix} A_1 & & * \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where each $A_j$ along the diagonal is a square matrix. Prove that

$$\det A = (\det A_1) \cdots (\det A_m).$$


7 Suppose $A$ is an $n$-by-$n$ matrix with real entries. Let
$S \in \mathcal{L}(\mathbf{C}^n)$ denote the operator on $\mathbf{C}^n$ whose
matrix equals $A$, and let $T \in \mathcal{L}(\mathbf{R}^n)$ denote the
operator on $\mathbf{R}^n$ whose matrix equals $A$. Prove that
$\operatorname{trace} S = \operatorname{trace} T$ and $\det S = \det T$.


8 Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$. Prove that

$$\det T^* = \overline{\det T}.$$

Use this to prove that $|\det T| = \det\sqrt{T^*T}$, giving a different proof
than was given in 10.47.


9 Suppose $\Omega$ is an open subset of $\mathbf{R}^n$ and $\sigma$ is a
function from $\Omega$ to $\mathbf{R}^n$. Suppose $x \in \Omega$ and $\sigma$
is differentiable at $x$. Prove that the operator
$T \in \mathcal{L}(\mathbf{R}^n)$ satisfying the equation in 10.56 is unique.

[This exercise shows that the notation $\sigma'(x)$ is justified.]


10 Suppose $T \in \mathcal{L}(\mathbf{R}^n)$ and $x \in \mathbf{R}^n$. Prove
that $T$ is differentiable at $x$ and $T'(x) = T$.


11 Find a suitable hypothesis on $\sigma$ and then prove 10.57.

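12 Let $a, b, c$ be positive numbers. Find the volume of the ellipsoid

$$\Bigl\{(x, y, z) \in \mathbf{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} < 1\Bigr\}$$

by finding a set $\Omega \subset \mathbf{R}^3$ whose volume you know and an
operator $T \in \mathcal{L}(\mathbf{R}^3)$ such that $T(\Omega)$ equals the
ellipsoid above.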